Determinants of Investors' Information
Acquisition: Credibility and Confirmation


Jane Thayer 
The University of Georgia 


ABSTRACT: Psychology theory suggests that current investors are unlikely to be im- 
partial in their acquisition of information regarding a currently held position after receiv- 
ing information that casts doubt on the profitability of the position. This study examines 
whether the favorability of initial information an investor receives subsequent to taking 
an investment position moderates his acquisition of additional information regarding 
that position. A web-based experiment indicates that participants generally choose to 
view information based on its credibility. However, the majority of information chosen by 
participants initially receiving unfavorable information supports the investment position 
they chose at the beginning of the case. Moreover, participants receiving unfavorable 
information prefer to view lower-credibility, preference-consistent information at a simi- 
lar rate to higher-credibility, preference-inconsistent information. Given a psychological 
need to support a previous investment decision, participants forgo a certain amount of 
credibility in the information they gather to confirm their belief in their investment posi- 
tion. 


Keywords: investors; information acquisition; preferences; credibility; experiment. 
Data Availability: Contact the author. 


I. INTRODUCTION 
Investors have an economic incentive to be objective and impartial in their acquisition of
information, as a high-quality set of information is central to an optimal investment decision.
Psychology research suggests, however, that impartiality will be more difficult after an invest- 
ment position has been taken (e.g., Festinger 1957; Leventhal and Brehm 1962; Festinger 1964). 
An underlying assumption of cognitive dissonance theory is that individuals prefer their actions to
match their beliefs (Festinger 1957). Information received subsequent to taking a position influ- 
ences whether this internal balance is maintained (Festinger 1957, 1964). If information casts 
doubt on the future profitability of an investor's position, then psychology theory suggests that the 
investor may selectively seek additional information supporting the position to regain that balance 
(e.g., Frey 1986). This study examines whether the initial information an investor receives subse- 
quent to taking an investment position influences the type of information s/he acquires to make 
future judgments and decisions about the position. Will an investor acquire a balanced set of 
credible information or will s/he acquire information that confirms his/her investment position?

Investors' reactions to information based on its source credibility (e.g., Clement and Tse 2003; 
Gleason and Lee 2003) suggest that investors recognize the need to be accurate in their estimation 
of security prices. For example, investors react more quickly to forecast revisions made by celeb- 
rity analysts (Gleason and Lee 2003) and analysts employed by prestigious brokerage firms 
(Clement and Tse 2003) than to those made by lesser-known analysts. In addition, investors react
to the credibility of management forecasts as well as to the news, or surprise, in those forecasts
(Jennings 1987; Mercer 2004; Hutton and Stocken 2009). Investors have also been shown to 
respond more strongly to earnings released by companies employing large, international audit 
firms compared to those of companies employing smaller audit firms (Teoh and Wong 1993). It is 
unknown, however, whether an investor's sensitivity to the credibility of information will continue
after s/he receives initial information that casts doubt on a recently chosen position.

Although it is more difficult to remain objective after taking a position, research suggests that 
individuals are mindful of the credibility of information regarding the position, especially when 
the information is unfavorable (Ditto et al. 1998). However, given the choice between high- 
credibility information that is consistent with their preference to believe in the position (hereafter, 
preference-consistent information) and that which is inconsistent with their preferred belief (hereafter,
preference-inconsistent information), studies show that an individual’s need to support a
desired belief will lead him/her to choose the former (Frey 1981, 1986). These findings, however,
result from a context that does not include an immediate economic incentive to gather a balanced 
set of credible information as an aid to decision making (Frey 1981). 

With regard to individuals’ willingness to trade off credibility for preference-consistency, 
studies suggest that individuals selectively seeking preference-consistent information will prefer to 
receive preference-inconsistent information if it has more credibility than available preference- 
consistent information (Lowin 1969). However, the psychology studies investigating individuals’ 
preferences for information provide participants with information varied between two extreme 
levels of credibility (e.g., Lowin 1967, 1969; Frey 1981; Ditto et al. 1998). Given the vast amount 
of information from various sources in the capital markets, it is unknown whether an investor 
would seek information based on its credibility or whether s/he would choose to acquire informa- 
tion supportive of his/her investment position. 


Regarding information acquisition, I use the term “balanced” to describe a representative sample of the information 
available from all sources, with no one viewpoint under- or over-represented in the sample, relative to its representation 
in the population of available information. 

Studies show that rankings and employer size are related to analysts’ forecast accuracy (Stickel 1992; Clement 1999; 
Hong and Kubik 2003); therefore, these signals can provide investors with a cognitive shortcut in determining the 
credibility of a particular forecast. 

The psychology literature uses the terms “preference-consistent” and “preference-inconsistent” to describe information
that is congruent or incongruent, respectively, with an individual's preferred belief, regardless of when it is received in
a decision-making process (e.g., Ditto and Lopez 1992; Ditto et al. 1998; Ditto et al. 2003). However, for ease of
exposition, I will use the terms “favorable” or “unfavorable” to describe the valence of the initial information an
investor receives subsequent to taking an investment position and “preference-consistent” or “preference-inconsistent”
to describe the valence of the additional information the investor seeks subsequent to receiving initial information.


The Accounting Review January 2011 
American Accounting Association 




The capital markets information environment is unique, in that investors often know—before 
investing in detailed analysis of a financial report or news article—whether information from a 
given source supports a particular investment position. For example, websites such as Yahoo! 
Finance and Dow Jones' Market Watch offer listings of analysts' broad recommendations (e.g., 
buy, hold, sell) for a given stock. Dow Jones’ Market Watch separates analysts’ recommendations 
by upgrades, downgrades, and initiations on its site, providing the name of the brokerage firm, the 
new recommendation, the old recommendation, and a brief set of comments summarizing the 
analyst's report. Therefore, investors can quickly determine the flavor of these reports and may, in 
certain cases, select information based on its level of support for a particular investment position. 
Moreover, investors’ ability to “pre-screen” information is not limited to analyst reports, but
is also inherent in searches for news and popular press articles through databases such as Factiva,
ProQuest, and Lexis-Nexis. Federated search engines for business information, such as
www.Biznar.com, also allow individuals to search a topic and receive an article title and source.

Given the psychological need to find support for a belief in a prior action when the action is 
called into question (e.g., Festinger 1957, 1964; Frey 1986) and the nature of today's information 
environment, I predict that investors seeking support for a previous investment decision after the 
receipt of unfavorable information will acquire additional information that confirms their position. 
Specifically, I predict that these investors will choose high-credibility, preference-consistent infor- 
mation over high-credibility, preference-inconsistent information. Moreover, I predict that these 
individuals, who are in search of information to bolster their belief in their investment position, 
will forgo a certain amount of credibility to acquire support for that position. On the other hand, 
I predict that individuals who receive initial favorable information regarding their investment 
position will experience less of a need to support their position and will seek a balanced set of 
information from high-credibility sources. 

I use a web-based experiment to test my predictions about investors’ information acquisition 
after they receive initial information regarding a previously chosen investment position. Holding 
constant the initial information given to participants regarding their chosen investment, I manipu- 
late the favorability of that information by manipulating two variables between participants: (1) 
the participant's directional investment position (long versus short) and (2) the participant's earn- 
ings benchmark (high versus low). The randomly assigned investment position (long versus short) 
introduces a directional preference regarding the firm's earnings outcome, and the randomly as- 
signed earnings benchmark (high versus low) introduces a preference for a particular level of 
earnings. Those assigned a long (short) position are asked to choose a long (short) investment in 
one of two firms.

After making their investment choice and learning their earnings benchmark, participants 
receive initial information about their investment and are provided an opportunity to view addi- 
tional information in the form of four analyst reports before making an EPS forecast. Two reports 
are issued by analysts employed by well-known brokerage firms (high-credibility sources) and two 
by unknown analysts (low-credibility sources). Within each set, one estimates high earnings and 
the other estimates low earnings. Participants can view as many or as few of the reports as they 
wish. The primary dependent measure is the time spent viewing each analyst report. 

Results generally support my predictions, as I find that the information sought by participants 
is a function of the favorability of the initial information they receive subsequent to taking an 
investment position. Holding constant the amount of preference-consistent information offered by 
high- and low-credibility sources, participants prefer to view high-credibility information. How- 
ever, participants receiving initial favorable information are more balanced in their acquisition of 
information based on its preference-consistency than are participants receiving unfavorable infor- 
mation. The majority of high-credibility information chosen by participants receiving unfavorable 
information is that which is preference-consistent. Additionally, participants receiving initial
unfavorable information prefer to view, at a similar rate, less credible information that is
preference-consistent and more credible information that is preference-inconsistent. Consequently,
participants' earnings expectations are biased in the direction of the information they choose to view.

These findings contribute to both the accounting and psychology literatures regarding the 
influence of individuals’ preferences on information acquisition and evaluation in three primary 
ways. First, my study presents a scenario in which the initial information an investor receives 
regarding a recent investment sets the course of his/her information-acquisition process. Results 
suggest that an investor receiving unfavorable information regarding his/her investment position 
will gather additional information in such a way that the resulting information set overly repre- 
sents information supporting his/her position. Additionally, some of the over-represented informa- 
tion may have less credibility than information that is under-represented. The outcome of this 
information-acquisition process is a set of information that can result in biased estimates and 
forgone profits. 

Second, the finding that investors are mindful of a source's viewpoint when gathering infor- 
mation has implications for archival researchers investigating market-pricing efficiency (e.g., Dia- 
mond and Verrecchia 1987; Figlewski and Webb 1993; Clement and Tse 2003; Gleason and Lee 
2003). The results of my study suggest that, under certain circumstances, investors are likely to 
consider an information source's viewpoint in addition to its credibility in reacting to the infor- 
mation (e.g., Clement and Tse 2003; Gleason and Lee 2003; Mercer 2004; Hutton and Stocken 
2009). Additionally, the results present a potential reason for findings that the market's pricing of 
firms with short-selling constraints, which by definition lack bearish traders, is less efficient than 
that for firms with short-sale opportunities (Diamond and Verrecchia 1987; Figlewski and Webb 
1993). 

Third, the study suggests that prior findings in psychology, which find that individuals receiv- 
ing preference-inconsistent information are sensitive to the credibility of that information and any 
additional information they might receive, represent a boundary condition (e.g., Lowin 1969; Frey 
1981; Ditto et al. 1998). The information offered to participants in these psychology studies is 
varied between two extreme levels of credibility, resulting in participants choosing high-credibility,
preference-inconsistent information over low-credibility, preference-consistent information
(Frey 1981). On the other hand, the capital markets information environment offers investors
information from various sources representing a wide range of credibility. My study indicates
that investors who are seeking support for a desired belief and who are presented with a choice of 
information from various sources are likely to accept some reduction in information credibility in 
order to gather support for their position. 

The remainder of the paper is organized as follows. Section II develops the theory and 
hypotheses. The experimental design and experimental results are presented in Sections III and IV, 
respectively. Section V summarizes and concludes. 


II. THEORY AND HYPOTHESES 

Source credibility is one way in which an investor's level of uncertainty in information can be 
reduced (Erdem and Swait 1998). This reasoning is supported by empirical evidence that investors 
react to information based on attributes of its source (e.g., Jennings 1987; Teoh and Wong 1993;
Clement and Tse 2003; Gleason and Lee 2003; Mercer 2004; Rogers and Stocken 2005; Hutton 
and Stocken 2009), which suggests that investors recognize the importance of forming an accurate 
estimation of a security's market price. These studies, however, do not consider the current posi- 
tion of an investor. Psychology theory suggests that current investors are likely to find it difficult 
to be impartial and objective in their search for information regarding a currently held position 
after receiving information that casts doubt on the profitability of the position (e.g., Festinger
1957, 1964; Frey 1986). The current study examines whether the favorability of initial information
an investor receives subsequent to taking an investment position moderates his/her acquisition of 
additional information regarding that position. 

Extant research in accounting and psychology does not provide a clear prediction of investors’ 
information-acquisition process in the course of making judgments and decisions regarding a 
currently held position. Hales (2007) finds that experimental participants, in the role of investors, 
interpret analyst forecast information based on their investment position. Specifically, Hales 
(2007) presents results suggesting that an investor’s earnings expectation for an investee firm is 
more likely to agree with analyst forecast information if that information indicates a high likeli- 
hood of a potential gain as compared to that of a loss. However, participants in Hales’ (2007) study 
were not provided with identifying information regarding the reporting analysts’ credibility. Re- 
search in psychology suggests that an investor receiving unfavorable information regarding his/her 
investment position will be particularly sensitive to the credibility of the information (Ditto et al. 
1998). In turn, s/he would adjust his/her beliefs if the unfavorable information was of high- 
credibility compared to low-credibility, favorable information. 

On the other hand, the psychology research investigating individuals’ sensitivity to informa- 
tion based on its credibility and preference-consistency manipulates credibility in an extreme, 
dichotomous manner (e.g., Lowin 1969; Frey 1981; Ditto et al. 1998). It is unclear how individu- 
als will choose to receive information provided by different sources when the credibility of those 
sources is varied at more intermediate levels (Lowin 1967). My study seeks a nuanced description 
of individuals’ information-acquisition process when there are preferences to support a desired 
belief and a more moderate amount of disparity in the various information sources’ credibility. 


Credibility and Preference Consistency 


Cognitive dissonance forms the basis for the effect of directional preferences on individuals’ 
information acquisition (Festinger 1957, 1964). Festinger (1957) describes dissonance as arising 
when two incompatible sets of knowledge, or cognitions, are held simultaneously. The need to 
reduce dissonance can lead individuals to be selective in their acquisition of additional informa- 
tion. Because of the opportunity for dissonance to arise upon the initial receipt of information 
subsequent to taking an investment position, I argue that the favorability of this information 
regarding the investor’s position sets the course of his/her information-acquisition process. 

The receipt of information supporting the likely profitability of a recently chosen investment 
position will corroborate the investor’s decision. As such, s/he should experience less need for 
supportive information, instead seeking a balanced set of information from high-credibility 
sources regardless of the source’s support of the investment position. On the other hand, the 
receipt of unfavorable information subsequent to an investment choice should arouse dissonance 
and the investor will, consequently, seek a reduction of the dissonance. One way that individuals 
can reduce dissonance is by searching for preference-consistent information (Brehm and Cohen 
1962). 

Research in psychology, however, suggests that investors selectively seeking preference- 
consistent information will prefer to receive preference-inconsistent information if it is of higher 
quality than available preference-consistent information (Frey 1981). This finding, along with 
research suggesting that individuals receiving high-credibility, preference-inconsistent information 
will update their preferred belief if additional preference-consistent information is low in credibil- 
ity (Ditto et al. 1998), suggests that individuals seeking additional preference-consistent informa- 
tion are sensitive to source credibility. Therefore, holding constant the amount of preference- 
consistent information available from high-credibility and low-credibility sources, I expect 
investors to acquire more high-credibility information than low-credibility information. This pre- 
diction is summarized below: 
H1: Investors receiving initial favorable or unfavorable information regarding their position
will acquire more information from high-credibility sources than from low-credibility
sources.

Findings in psychology suggest, however, that given a choice of information from two similarly
credible sources, investors will prefer preference-consistent information to preference-inconsistent
information (Lowin 1967, 1969; Frey 1986). Such a finding would indicate that an
investor is collecting an unbalanced set of information, which could result in the potential over- 
weighting of information supportive of his/her current position in future decision making. Addi- 
tionally, this result of selective acquisition would have implications for the generality of previous 
archival findings of investors' reaction to information based on source credibility (e.g., Clement 
and Tse 2003; Gleason and Lee 2003), as these studies do not control for the current position of
investors. 

Regardless of an investor's economic incentives, I predict that the need to support a previous
investment decision will influence his/her information-acquisition process. I expect that an inves- 
tor receiving initial favorable information regarding his/her investment position will seek a bal- 
anced set of both preference-consistent and preference-inconsistent information from high- 
credibility sources. On the other hand, I expect an investor for whom initial information is 
unfavorable regarding his/her position to seek high-credibility information that is preference- 
consistent over that which is preference-inconsistent. These predictions are summarized below: 


H2a: Investors receiving initial favorable information regarding their position will acquire an 
equal set of preference-consistent and preference-inconsistent information from high- 
credibility sources. 


H2b: Investors receiving initial unfavorable information regarding their position will acquire 
more preference-consistent information than preference-inconsistent information from
high-credibility sources.

Credibility versus Preference Consistency

Previous research examining individuals' selective exposure to information shows that individuals
tend to select high-credibility, preference-inconsistent information over low-credibility,
preference-consistent information (Frey 1981). This is consistent with findings that information 
usefulness can influence information search over an individual's need to reduce dissonance (Wick- 
lund and Brehm 1976). The fear of invalidity or a judgmental mistake can lead individuals to seek 
additional information they view as useful before making a conclusion (Kruglanski and Webster 
1996). 

The information presented to experimental participants in the psychology studies of selective 
exposure to information based on source credibility is varied between two extreme levels of 
credibility (e.g., Lowin 1969; Frey 1981). For example, both Lowin (1969) and Frey (1981)
present to experimental participants information that is designed to be either maximally easy or 
hard to refute. These studies manipulate the messages’ ease of refutation by varying the presumed 
expertise of the information source. Participants were told that the messages were provided by 
either an expert economist (i.e., low ease of refutation) or by high school sophomores (i.e., high 
ease of refutation).

The information environment manufactured in the psychology studies described above is very 
different from the capital markets information environment. With the extensive amount of infor- 
mation available to investors in the marketplace, the sources of information are also vast. Given 
the availability of information from various sources representing a wide range of credibility, an 
investor may decide to trade off some amount of credibility in order to gather information supportive
of his/her investment position, especially after the receipt of information that casts doubt
on the future profitability of his/her position. I predict that investors receiving initial information 
that is unfavorable regarding their position will not acquire preference-inconsistent information 
from a high-credibility source at a greater rate than preference-consistent information from a 
low-credibility source. This prediction is summarized below: 


H3: Investors receiving initial unfavorable information regarding their position will not acquire
more high-credibility, preference-inconsistent information than low-credibility,
preference-consistent information. 


III. EXPERIMENT

Participants 

To test my hypotheses, I conduct a web-based experiment with 92 second-year students from 
a BusinessWeek top-25 M.B.A. program taking on the role of investors. Approximately 58 percent 
of the participants report that they have made investments in individual firm stocks and 99 percent 
indicate that they plan to do so in the future. Those who have invested in individual stocks report 
having purchased a mean (median) of 55 (20) stocks. Additionally, 79 percent of participants 
report having previously invested in stock mutual funds. 


Experimental Procedures 


I designed a web-based experimental platform to test whether an investor's need to support an 
investment decision influences the type of information s/he acquires before completing a basic 
earnings-forecasting task. Each participant chooses one of two firms in which to invest. A choice
is included in the task in order that the decision instills in participants a commitment to their 
investment position (Brehm and Cohen 1962). In making an investment choice, participants are 
expected to feel responsible for the future consequences of their decision. 

Participants are randomly assigned to either a long position or a short position before choos- 
ing a firm in which to invest. Participants view information regarding each of the firms in order to 
make their choice. The information is not only designed to present the firms as equally attractive, 
but is also ambiguous regarding the future profitability of the firms. Information presented to
participants is held constant across all conditions. After making their choice, participants are 
randomly assigned either a high or low earnings-per-share benchmark to which actual EPS are 
compared in determining if they gain or lose from their individual investment. 

Participants’ choice of Firm X or Firm Y has no effect on the information presented subse- 
quently. That is, all participants see the same information after their investment choice, with the 
exception of the firm’s name. Participants receive quarterly earnings information for the past three 
years, recent historical stock prices over the same period, and the consensus analyst forecast for 
the upcoming fiscal year, including the range of analysts’ individual forecasts. The consensus 
analyst forecast is either favorable or unfavorable, based on the individual participant’s assigned 
investment position and earnings benchmark. After receiving this information, participants have an 
opportunity to view any or all of four analysts’ reports before making their final EPS estimate. 
Figure 1 presents a timeline of the experimental task. See the Appendix for an example of the 
wording of the experimental manipulations. 


Brehm (1956) finds that post-choice dissonance increases as the attractiveness of the rejected alternative increases. An
individual choosing between two attractive alternatives has a choice to make; however, an individual selecting between 
one attractive alternative and one unattractive alternative will likely feel as if s/he does not have a choice. 
This information, including earnings information, is modeled after that of an actual publicly traded firm. 
FIGURE 1
Timeline of Experimental Task

1. Instructions: Participants are provided with task instructions, including an overview of the compensation structure.
2. Investment Choice (see note a): Participants make an investment choice. Investment position is manipulated between participants.
3. Earnings Benchmark (see note b): Participants are provided with an earnings benchmark, which is manipulated between participants.
4. Initial Information: Information about the firm, including recent financial information, is provided. The consensus analyst forecast of $3.80 is presented.
5. Preliminary EPS Forecast: Participants are asked to make a preliminary forecast of the firm's annual EPS.
6. Additional Information Acquisition (see note c): Participants are provided three minutes to view four individual analyst reports.
7. Final EPS Forecast: Participants are asked to make a final forecast of the firm's annual EPS.
8. Post-Experiment Questionnaire: Manipulation success is checked. Demographic information is collected.

a Experiment participants assigned to the long (short) investment position condition were asked to choose a
long (short) investment in one of two firms. Prior to making their investment choice, all participants viewed
the same information regarding the two firms.

b Participants were randomly assigned an earnings benchmark of either $3.76 or $3.84. Those assigned to the
long investment position condition were told that they would receive a higher (lower) payoff when actual
earnings were greater (less) than their earnings benchmark. Participants assigned to the short investment
position condition were told that they would receive a higher (lower) payoff when actual earnings were less
(greater) than their earnings benchmark.

c The amount of time a participant spends viewing each of the four analyst reports is the main dependent
measure.


Experimental Treatments 


I employ a 2 X 2 full-factorial, between-participants design that manipulates (1) the participant's
investment position (long versus short) and (2) their earnings benchmark (low versus
high). As in Hales (2007), each participant's payoff is either a positive or a negative function of 
the subject firm's earnings, depending on his/her randomly assigned investment position (long 
versus short). After participants make their investment choice, they are randomly assigned an 
earnings-per-share benchmark of either $3.76 (low) or $3.84 (high). Those in the long investment 
position condition experience a higher (lower) payoff when actual earnings are greater (less) than 
their earnings benchmark. Participants in the short investment position condition experience a 
higher (lower) payoff when actual earnings are less (greater) than their earnings benchmark. The 
combination of participants' randomly assigned investment position and earnings benchmark determines
whether the initial information received subsequent to their investment choice (i.e., the
consensus analyst forecast of $3.80) is favorable or unfavorable.
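
The mapping from the two manipulated variables to the favorability of the consensus forecast can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and are not taken from the experimental instrument; the only values drawn from the text are the benchmarks ($3.76 and $3.84), the consensus forecast ($3.80), and the payoff rule for long and short positions.

```python
CONSENSUS_EPS = 3.80  # consensus analyst forecast shown to every participant


def is_favorable(position: str, benchmark: float,
                 forecast: float = CONSENSUS_EPS) -> bool:
    """Return True when the forecast implies a gain for the position.

    Long positions pay more when earnings exceed the benchmark;
    short positions pay more when earnings fall below it.
    """
    if position == "long":
        return forecast > benchmark
    if position == "short":
        return forecast < benchmark
    raise ValueError(f"unknown position: {position!r}")


# Enumerate the four cells of the 2 X 2 design.
conditions = {
    (position, benchmark): is_favorable(position, benchmark)
    for position in ("long", "short")
    for benchmark in (3.76, 3.84)
}

for (position, benchmark), favorable in conditions.items():
    label = "favorable" if favorable else "unfavorable"
    print(f"{position:5s} / benchmark ${benchmark:.2f}: {label}")
```

Under this reading, two cells (long with the low benchmark, short with the high benchmark) receive favorable initial information and the other two receive unfavorable information, even though every participant sees the same $3.80 consensus.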

After receiving the consensus analyst forecast, participants receive four analysts' forecasts and 
a link to view each analyst's corresponding report. Two of the individual forecasts, one from a 
well-known equity research firm and one from an unknown equity research firm, estimate annual 


The manipulation of the two independent variables is based on Hales (2007). I am grateful to Jeffrey Hales for his 
generosity in allowing me access to his experimental instrument. 

The consensus analyst forecast of $3.80 foretells that participants assigned to the long (short) investment position 
condition and an earnings benchmark of $3.76 ($3.84) will experience a gain with regard to their investment position. 
However, this consensus analyst forecast foretells that participants assigned to the long (short) investment position 
condition and an earnings benchmark of $3.84 ($3.76) will experience a loss based on their investment position. 
earnings to be at the top of the range of all analysts' individual forecasts (i.e., $3.88). The other 
two forecasts, one from a well-known firm and one from an unknown firm, estimate annual 
earnings at the bottom of the range of forecasts (i.e., $3.72). Results from a separate experiment 
indicate that the well-known firms are expected to provide more credible research than the un- 
known firms. 

The four forecasts are presented in a 2 × 2 matrix rather than a list in order to prevent 
participants from choosing the analyst reports sequentially. In addition, the computerized program 
randomizes, across participants, the placement of the forecasts in the matrix. In order to view a 
particular report, a participant must click on the link associated with that report. Participants are 
given up to three minutes to view any or all of the four analyst reports; however, participants who 
spend less than the allotted time are awarded bonus points. Therefore, at any point during these 
three minutes, participants have the opportunity to stop viewing this information and move for- 
ward in the task where they are asked to provide their final EPS forecast. 


Dependent Measures 


After receiving initial information (i.e., the consensus analyst forecast) that is either favorable 
or unfavorable regarding their investment position, participants have an opportunity to view four 
individual analyst reports. I measure the amount of time a participant spends viewing each of the 
four reports. This allows me to examine whether participants are choosing to view information 
based on its credibility or its preference consistency. An analyst report will be preference consis- 
tent, or will bolster the participant's preferred belief that s/he made a wise investment choice, if it 
suggests that s/he will experience a gain from the investment (i.e., more optimistic forecasts if a 
long position is held; less optimistic forecasts if a short position is held). The analyst report will 
be preference-inconsistent if it casts doubt on the future profitability of the participant's chosen 
investment position (i.e., less optimistic forecasts if a long position is held; more optimistic 
forecasts if a short position is held). 
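
The classification of the four analyst reports can be sketched as follows. The names `Report` and `classify` are hypothetical, and treating the $3.80 consensus as the reference point for "more optimistic" is an assumption made for illustration; with forecasts of $3.88 and $3.72, comparing against either earnings benchmark gives the same labels.

```python
from typing import NamedTuple, Tuple


class Report(NamedTuple):
    well_known_firm: bool  # True for a well-known (higher-credibility) firm
    forecast_cents: int    # the analyst's EPS forecast, in cents


# The four reports described in the text: $3.88 and $3.72 from each firm type.
REPORTS = [Report(True, 388), Report(False, 388),
           Report(True, 372), Report(False, 372)]


def classify(position: str, report: Report,
             reference_cents: int = 380) -> Tuple[str, str]:
    """Label a report by credibility and preference consistency."""
    if position not in ("long", "short"):
        raise ValueError(f"unknown position: {position!r}")
    optimistic = report.forecast_cents > reference_cents
    # Optimistic forecasts support a long position; pessimistic ones a short.
    consistent = optimistic if position == "long" else not optimistic
    credibility = "high" if report.well_known_firm else "low"
    return credibility, "consistent" if consistent else "inconsistent"


labels_for_long = [classify("long", r) for r in REPORTS]
```

Each participant therefore sees exactly one report of each of the four types, whichever position is assigned.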

Participants’ individual viewing times for each report allow me to examine their relative 
preferences to acquire the information. As the amount of preference-consistent information pro- 
vided by high- and low-credibility sources is held constant, H1 predicts that all participants will 
spend more time viewing high-credibility information than low-credibility information. Hypoth- 
esis 2a predicts that participants for whom the consensus analyst forecast is favorable will spend 
equal time viewing the high-credibility, preference-consistent and high-credibility, preference- 
inconsistent analyst reports. Hypothesis 2b predicts that participants for whom the consensus 
analyst forecast is unfavorable will spend more time viewing the high-credibility, preference- 
consistent analyst report than the high-credibility, preference-inconsistent analyst report. Last, H3 
predicts that participants for whom the consensus analyst forecast is unfavorable will not spend 
more time viewing the high-credibility, preference-inconsistent analyst report than the low- 
credibility, preference-consistent analyst report. 


5 In a separate experiment, 41 participants from the same population as that examined in the main experiment rated the 
credibility of equity research they would expect to receive from each of these four firms. A five-point Likert scale was 
used with 1 being "not at all credible" and 5 being "highly credible." Mean ratings for the two well-known firms were 
4.20 and 4.27. The difference in these ratings is not significant (t40 = 0.83, p = 0.41). The two unknown firms received 
mean ratings of 3.22 and 3.27. The difference in these ratings is not significant (t40 = 0.36, p = 0.72). However, the 
average credibility rating for the two well-known firms (mean = 4.23) was significantly greater than that for the 
unknown firms (mean = 3.24) (t40 = 4.60, p < 0.001). Going forward, I use the labels “low-” and “high-” credibility to 
connote that the rating of the credibility of the research issued by the two unknown firms was significantly less than the 
rating of the credibility of the research issued by the two well-known firms. As such, these labels are not meant to 
convey absolute levels of credibility but, instead, relative levels of credibility. 
Compensation 


Participants receive compensation based on the number of points earned from four sources. 
They receive 25 participation points at the beginning of the case. Additionally, they earn points 
based on the accuracy of their final EPS forecast. If actual earnings equal the participant's fore- 
casted earnings, then s/he earns 25 points; however, one point of the 25 "accuracy points" is 
deducted for every penny that his/her forecast differs from actual earnings. This component of the 
compensation scheme provides participants an incentive to make an accurate forecast. 

Additionally, participants earn or lose up to 25 points based on the outcome of their invest- 
ment position. Those holding a long position earn (lose) two points for every penny that actual 
earnings-per-share exceeds (falls below) their earnings benchmark. Similarly, participants holding 
a short position earn (lose) two points for every penny that actual earnings-per-share falls below 
(exceeds) their earnings benchmark. All gains or losses from the participant's investment position 
are capped at 25 points. This component of the compensation scheme captures investors' incentive 
to hold a profitable investment.

Lastly, participants have an opportunity to receive bonus points if they do not use the entire 
three minutes allotted to view the analyst reports. A three-minute timer is provided at the top of the 
computer screen in order for participants to keep track of their time. The timer clicks down to zero 
before automatically advancing to the next section of the study where participants are asked to 
make their final EPS forecast. If participants choose to move forward with 60 seconds or more on 
the timer, they are awarded ten bonus points. Participants leaving between 30 and 59 seconds on 
the timer receive five bonus points. These points are added to the participant's total points at the 
end of the study. This component of the compensation scheme captures the effort expended in 
information acquisition. 

In sum, participants have an opportunity to earn up to 85 points. At the end of the case, each 
participant's points are converted into U.S. dollars. Participants are guaranteed payment of $5, but 
could earn up to $25. On average, participants receive compensation equal to $16.
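
The four point sources can be combined in a short sketch. The function names are hypothetical, and flooring the accuracy points at zero is an assumption, since the text does not say what happens when a forecast misses by more than 25 cents. All amounts are in cents, and the inputs in the two examples are illustrative rather than taken from any participant.

```python
PARTICIPATION_POINTS = 25


def accuracy_points(forecast_cents: int, actual_cents: int) -> int:
    """25 points minus one point per penny of forecast error."""
    # Assumption: the accuracy component cannot go below zero.
    return max(0, 25 - abs(forecast_cents - actual_cents))


def position_points(position: str, benchmark_cents: int, actual_cents: int) -> int:
    """Two points per penny relative to the benchmark, capped at +/-25."""
    if position not in ("long", "short"):
        raise ValueError(f"unknown position: {position!r}")
    gap = actual_cents - benchmark_cents
    if position == "short":
        gap = -gap  # a short position gains when earnings fall short
    return max(-25, min(25, 2 * gap))


def bonus_points(seconds_left: int) -> int:
    """Ten points for 60+ seconds left, five for 30-59 seconds, else none."""
    if seconds_left >= 60:
        return 10
    if seconds_left >= 30:
        return 5
    return 0


def total_points(position: str, benchmark_cents: int, forecast_cents: int,
                 actual_cents: int, seconds_left: int) -> int:
    return (PARTICIPATION_POINTS
            + accuracy_points(forecast_cents, actual_cents)
            + position_points(position, benchmark_cents, actual_cents)
            + bonus_points(seconds_left))


# Illustrative inputs only.
example = total_points("long", 376, 384, 381, 70)    # 25 + 22 + 10 + 10
best_case = total_points("long", 360, 381, 381, 120)  # 25 + 25 + 25 + 10
```

With these inputs the example participant earns 67 points, and the best case reaches the 85-point maximum stated above.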


IV. RESULTS
Manipulation Checks 


To verify that participants attended to the manipulation of investment position (i.e., long 
versus short), I asked them in a post-experiment questionnaire to indicate whether they gained or 
lost points if the firm in which they were invested experienced an increase in earnings. This
manipulation was successful: 87 percent (41/47) of participants assigned a long investment position
reported that they gained points if earnings rose, while only 7 percent (3/45) of participants 


To ensure that participants understand the accuracy incentive and profitability incentive components of the compensation
scheme, they are required to answer correctly five questions before moving forward in the task. 

Studies of selective exposure to information generally force participants to choose between sets of supportive and 
non-supportive information (e.g., Lowin 1967; Frey and Wicklund 1978; Frey and Stahlberg 1986). Instead of incorporating
a similar choice into my study, I implement the time cost. The costliness of analyzing all information should 
induce choice. Additionally, time spent in an information-processing task is highly correlated with characteristics of 
effort (Bettman et al. 1990; Sprinkle 2000). Therefore, the time participants spend viewing the analyst reports is a 
measure of the effort they are willing to expend in gathering additional information. 

Compensation was paid at a rate of approximately $0.29 per point earned, with a minimum compensation equal to 
$5.00. Participants earned an average of 54 points. 

Prior to making their investment choice, all participants viewed the same information regarding the two firms. Participants’
investment position condition did not influence their investment choice (χ²(1) = 0.69, p = 0.41). In addition, 
participants’ compensation, based in part on their EPS estimate, was not affected by their investment choice (t90 
= 0.08, p = 0.94). Participants choosing Firm X earned an average of 53.6 points and received approximately $16 in 
compensation, while participants choosing Firm Y earned an average of 53.7 points and also received approximately $16 
in compensation.
assigned a short investment position used this response. The difference is significant (Fisher's 
Exact Test; p < 0.001). To verify that participants attended to the difference between the consensus
analyst forecast (i.e., $3.80) and their earnings benchmark (i.e., $3.76 versus $3.84), I asked 
them to provide both amounts. Eighty-nine percent (42/47) of those with an earnings benchmark 
of $3.76 provided a response that was within $0.02 of their benchmark. Ninety-six percent (43/45) 
of those with an earnings benchmark of $3.84 provided a response that was within $0.02 of their 
benchmark. Approximately 97 percent (89/92) of all participants were within $0.02 of the correct 
consensus analyst forecast. Most importantly, 94 percent (86/92) of all participants attended to the 
direction of the consensus analyst forecast in relation to their earnings benchmark. Participants' 
awareness of the consensus analyst forecast being higher or lower than their earnings benchmark 
and their responses indicating their understanding that they either gained or lost points when actual 
earnings increased indicate that the manipulation of the favorability of the consensus analyst 
forecast was successful.
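
For readers who wish to see how a Fisher's exact test on the investment-position check can be computed, the following is a minimal pure-Python sketch. It assumes a two-sided test that sums the probabilities of all tables no more likely than the observed one; the paper does not state which variant or software was used, and the function name is hypothetical. The counts are the 41 of 47 long participants and 3 of 45 short participants reported above.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "successes" in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table at least as extreme as the one observed.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: long, short. Columns: reported a gain when earnings rise, did not.
p_value = fisher_exact_two_sided(41, 47 - 41, 3, 45 - 3)
share_long = round(100 * 41 / 47)   # percent of long participants
share_short = round(100 * 3 / 45)   # percent of short participants
```

Under these assumptions the resulting p-value falls far below 0.001, in line with the reported result.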


Mixed-Design ANOVA 


As the objective of the study is to examine the effect of the favorability of initial information 
received subsequent to an investment decision (i.e., the consensus analyst forecast of $3.80) on 
investors’ viewing time of additional information, I collapse the four experimental conditions into 
two conditions based on favorability and report the results accordingly. The Favorable condition is 
comprised of participants assigned to the long (short) investment position with an earnings bench- 
mark of $3.76 ($3.84). The Unfavorable condition is comprised of participants assigned to the 
long (short) investment position with an earnings benchmark of $3.84 ($3.76). I then analyze the 
information using a mixed-design analysis of variance with Favorability as a between-participants 
variable and the Credibility (high credibility versus low credibility) and Preference Consistency 
(preference-consistent versus preference-inconsistent) of the analyst reports as two within- 
participants variables. 


Main Effects of Favorability, Credibility, and Preference Consistency on Viewing Time 

After receiving initial information regarding their chosen investment positions, participants 
had the opportunity to view four analyst reports. Based on the participant's assigned experimental 
condition, each report is one of four types: (1) a high-credibility, preference-consistent report, (2) 
a high-credibility, preference-inconsistent report, (3) a low-credibility, preference-consistent
report, and (4) a low-credibility, preference-inconsistent report. I expect that participants' viewing 
time of each of these reports will be influenced by the favorability of the initial information they 
receive regarding their position. Table 1 presents descriptive statistics of participants' viewing time 
based on Favorability and the type of analyst report (Credibility, Preference Consistency). 

The last column of Table 1 shows that participants in the Favorable condition spent, on 
average, 66.59 seconds viewing the analyst reports, while those assigned to the Unfavorable 
condition spent an average of 90.48 seconds. The mixed-design ANOVA model presented in Table 
2 shows the effect of Favorability on participants’ viewing time to be significant (F1,90 = 6.15, p = 
0.02). This result is consistent with the findings in psychology that individuals spend more time 
and effort processing information after the receipt of unfavorable information (e.g., Kruglanski 
1980, 1990; Ditto and Lopez 1992). 


13 The results reported in the paper are inferentially and statistically similar to the analysis excluding participants who 
answered the manipulation check questions incorrectly. 

14 I report all significant differences resulting between the two experimental conditions comprising either of the two 
Favorability conditions. 
TABLE 2
Mixed-Design ANOVA for the Viewing Time of Analyst Reports

                                                                                          Two-Tailed
Source                                                  df         SS         MS        F    p-value

Between-Participants Effects:
  Favorability                                           1    3275.03    3275.03     6.15       0.02
  Error                                                 90   47937.65     532.64
Within-Participant Effects:
  Credibility                                            1   14028.97   14028.97    27.17    < 0.001
  Credibility × Favorability                             1     231.44     231.44     0.45       0.51
  Error (Credibility)                                   90   46464.85     516.28
  Preference Consistency                                 1    4843.65    4843.65     8.47     < 0.01
  Preference Consistency × Favorability                  1    7572.31    7572.34    13.25    < 0.001
  Error (Preference Consistency)                        90   51443.12     571.59
  Credibility × Preference Consistency                   1     860.96     860.96     2.16       0.15
  Credibility × Preference Consistency × Favorability    1    5094.38    5094.38    12.76    < 0.001
  Error (Credibility × Preference Consistency)          90   35935.18     399.28


The four experimental conditions are collapsed into two conditions based on the Favorability of the initial information (i.e., 
consensus analyst forecast of $3.80). The Favorable condition is comprised of participants assigned to the long (short) 
investment position with an earnings benchmark of $3.76 ($3.84). The Unfavorable condition is comprised of participants 
assigned to the long (short) investment position with an earnings benchmark of $3.84 ($3.76). I analyze the information 
using a mixed-design ANOVA model with Favorability as a between-participants variable and the Credibility (high 
credibility versus low credibility) and Preference Consistency (preference-consistent versus preference-inconsistent) of the 
analyst reports as two within-participants variables. 
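
As a quick consistency check on the ANOVA, each F-ratio is the effect's mean square divided by the mean square of its error term. The sketch below recomputes the five F-ratios that are also reported in the text; the dictionary layout and variable names are illustrative assumptions, and the mean squares are read from Table 2.

```python
# (effect mean square, matching error mean square, F-ratio as reported)
effects = {
    "Favorability": (3275.03, 532.64, 6.15),
    "Credibility": (14028.97, 516.28, 27.17),
    "Preference Consistency": (4843.65, 571.59, 8.47),
    "Preference Consistency x Favorability": (7572.34, 571.59, 13.25),
    "Credibility x Preference Consistency x Favorability": (5094.38, 399.28, 12.76),
}

recomputed = {
    name: ms_effect / ms_error
    for name, (ms_effect, ms_error, _reported) in effects.items()
}

# An error mean square is its sum of squares divided by its 90 degrees of freedom.
between_error_ms = 47937.65 / 90

for name, (_ms_effect, _ms_error, reported) in effects.items():
    print(f"{name}: recomputed F = {recomputed[name]:.2f}, reported F = {reported:.2f}")
```

Each recomputed ratio agrees with the reported value to two decimal places.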


Holding constant the amount of preference-consistent information offered by high-credibility 
and low-credibility sources, H1 predicts that all participants will spend a greater amount of time 
viewing high-credibility reports than low-credibility reports. Consistent with H1, untabulated re- 
sults show that participants spent an average of 51.96 seconds viewing reports issued by analysts 
employed by the two high-credibility equity research firms, compared to an average of 27.10 
seconds viewing the reports issued by analysts employed by the two low-credibility firms. This 
difference leads to the significant main effect of Credibility on participants' viewing time of the 
four reports (F1,90 = 27.17, p < 0.001), as noted in Table 2. 

The mixed-design ANOVA in Table 2 also shows a main effect of Preference Consistency on 
participants’ viewing time of the four reports (F1,90 = 8.47, p < 0.01). This main effect, however, 
is qualified by a two-way interaction of Preference Consistency and Favorability on information 
viewing times (F1,90 = 13.25, p < 0.001). Compared to participants in the Favorable condition, 
participants in the Unfavorable condition allocated more of their viewing time to preference- 
consistent information than to preference-inconsistent information. Participants in the Favorable 
condition spent approximately the same amount of time viewing the preference-consistent analyst 
reports (mean = 31.48 seconds) and preference-inconsistent analyst reports (mean = 35.11 sec- 
onds) (t43 = 1.01, p = 0.32). However, participants in the Unfavorable condition spent significantly 
more time viewing the preference-consistent analyst reports (mean = 61.58 seconds) than the 
preference-inconsistent analyst reports (mean = 28.90 seconds) (t47 = 3.65, p = 0.001). 


Credibility and Preference Consistency 


Although there is a significant main effect of Credibility on the amount of time participants 
spent viewing the analyst reports, this main effect is qualified by a significant three-way interaction
of Credibility, Preference Consistency, and Favorability (F1,90 = 12.76, p < 0.001). The significant
interaction suggests that participants in the Favorable and Unfavorable conditions allocate 
their viewing time differently across the four analyst reports based on source credibility and the 
preference-consistency of the report. That is, the two-way interaction of Credibility and Preference 
Consistency on participants' viewing time varies across the two Favorability conditions. Panels A 
and B in Figure 2 show graphical presentations of the average viewing times of each of the four 
analyst reports for participants in the Favorable Condition and Unfavorable Condition, respec- 
tively. 

Hypothesis 2a predicts that participants in the Favorable condition will be balanced in their 
acquisition of high-credibility information, choosing to view an equal amount of preference- 
consistent and preference-inconsistent information. However, H2b predicts that participants in the 
Unfavorable condition will prefer to view more high-credibility, preference-consistent information 
than high-credibility, preference-inconsistent information. 

Simple effect tests indicate that participants in the Favorable condition spent more time 
viewing the high-credibility, preference-inconsistent analyst report (mean = 25.14 seconds) than 
the high-credibility, preference-consistent analyst report (mean = 18.93 seconds) (t44 = 2.05, p = 
0.05). Although this finding does not support H2a, it does not necessarily suggest that participants 
receiving favorable information regarding their investment position lack objectivity in their acquisition
of additional information. Instead, this result is redolent of investors’ openness to credible, 
contrary information (Festinger 1957, 1964; Frey 1986). Additional analysis indicates that participants
in the Favorable condition spent equal time viewing the preference-consistent (mean = 
12.55 seconds) and preference-inconsistent reports (mean = 9.98 seconds) issued by low-credibility
analysts (t43 = 1.00, p = 0.33). 

Turning next to participants assigned to the Unfavorable condition, I find that with both 
high-credibility and low-credibility analyst reports, participants preferred to view the preference-consistent
report over the preference-inconsistent report. Specifically, participants in the Unfavorable
condition spent, on average, 43.02 seconds viewing the high-credibility, preference-consistent 
analyst report compared to an average 16.17 seconds viewing the high-credibility, preference-inconsistent
analyst report (t47 = 3.63, p < 0.001). This result supports H2b. Similarly, these 
participants spent, on average, 18.56 seconds viewing the low-credibility, preference-consistent 
report compared to an average 12.73 seconds viewing the low-credibility, preference-inconsistent 
report (t47 = 1.88, p = 0.07). 
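
The cell means reported in this section can be tied back to the marginal means reported earlier with simple sums. The sketch below is illustrative only: the dictionary layout and helper names are assumptions, and the shares it computes are ratios of mean viewing times, which need not equal the average of each participant's own percentage.

```python
# Mean viewing time in seconds by (condition, credibility, preference consistency).
cell_means = {
    ("Favorable", "high", "consistent"): 18.93,
    ("Favorable", "high", "inconsistent"): 25.14,
    ("Favorable", "low", "consistent"): 12.55,
    ("Favorable", "low", "inconsistent"): 9.98,
    ("Unfavorable", "high", "consistent"): 43.02,
    ("Unfavorable", "high", "inconsistent"): 16.17,
    ("Unfavorable", "low", "consistent"): 18.56,
    ("Unfavorable", "low", "inconsistent"): 12.73,
}


def by_consistency(condition: str, consistency: str) -> float:
    """Sum the high- and low-credibility cells for one consistency level."""
    return sum(seconds for (cond, _cred, cons), seconds in cell_means.items()
               if cond == condition and cons == consistency)


def total_time(condition: str) -> float:
    """Sum all four cells for one Favorability condition."""
    return sum(seconds for (cond, _cred, _cons), seconds in cell_means.items()
               if cond == condition)


unfavorable_consistent = by_consistency("Unfavorable", "consistent")
consistent_share_unfavorable = unfavorable_consistent / total_time("Unfavorable")
consistent_share_favorable = (by_consistency("Favorable", "consistent")
                              / total_time("Favorable"))
```

The sums reproduce the reported marginal means to within rounding (for example, 61.58 seconds of preference-consistent viewing in the Unfavorable condition), and the preference-consistent share of mean viewing time exceeds one half only in the Unfavorable condition.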


Credibility versus Preference Consistency 

Results indicate that participants in the Unfavorable condition seek high-credibility, 
preference-consistent information over high-credibility, preference-inconsistent information, con- 
sistent with H2b. However, I expect these individuals' need to gather information supporting their 
position to be great enough to counteract the need to gather high-credibility information. Contrary 
to findings in psychology (e.g., Frey 1981, 1986), H3 predicts that investors receiving initial 
information that is unfavorable regarding their position will not acquire more high-credibility, 
preference-inconsistent information than low-credibility, preference-consistent information. 

Results support H3. Table 1 shows participants in the Unfavorable condition spent an average 
of 18.56 seconds viewing the low-credibility, preference-consistent analyst report compared to an 
average of 16.17 seconds viewing the high-credibility, preference-inconsistent analyst report. The 
difference in these viewing times is not significant (t47 = 0.70, p = 0.49). This result is 
compelling as previous findings in psychology suggest that individuals receiving unfavorable 
information prefer additional information that is credible to that which is preference-consistent 
(e.g., Frey 1981; Ditto et al. 1998). In the current study, however, only participants in the Favorable
condition favored credibility over preference-consistency. Putting this result into further 
FIGURE 2 
Graphical Depiction of Viewing Time of Four Analyst Reports 

Panel A: Favorable Condition

a The Favorable condition is comprised of participants assigned to the long (short) investment position with an 
earnings benchmark of $3.76 ($3.84). The Unfavorable condition is comprised of participants assigned to the 
long (short) investment position with an earnings benchmark of $3.84 ($3.76). 
An analyst report is preference-consistent with a participant's assigned long (short) investment position if the 
reporting analyst forecasts earnings to be above (below) the participant's earnings benchmark. Conversely, an 
analyst report is preference-inconsistent with a participant's assigned long (short) investment position if the 
reporting analyst forecasts earnings to be below (above) the participant's earnings benchmark. 


The Accounting Review January 2011 
American Accounting Association 




context, participants in the Favorable condition spent an average of 12.55 seconds viewing the 
low-credibility, preference-consistent analyst report compared to an average of 25.14 seconds 
viewing the high-credibility, preference-inconsistent analyst report. This difference is highly significant
(t43 = 3.82, p < 0.001). 




Forecasted Earnings 


In order to examine whether selective information acquisition influences subsequent judg- 
ments, participants were assigned the task of forecasting annual earnings-per-share for the subject 
firm. Participants provided a preliminary EPS forecast after receiving the consensus analyst fore- 
cast of $3.80 but prior to viewing any of the analyst reports. The average preliminary EPS forecast 
of participants in each of the four conditions was not different from the consensus analyst forecast 
of $3.80 (all p > 0.20). After having the option to view additional information in the analyst 
reports, participants provided their final forecasts. Table 3 presents the mean forecasts of participants
in each of the four experimental conditions. 

The mean forecast of participants assigned to the long investment position was $3.85, while 
the mean forecast of participants in the short investment position condition was $3.78. The difference
of $0.07 is significant (F1,88 = 51.21, p < 0.01). This result replicates the general findings 
of Hales (2007) that investors’ directional preferences regarding an investment position influence 
their interpretation of information regarding the investment.

Although the average EPS forecast of participants in the Unfavorable, Long condition was 
greater than the average of those in the Favorable, Long condition, the difference is not significant 
(mean for Unfavorable, Long = $3.86; mean for Favorable, Long = $3.84) (t45 = 1.01, one-tailed p-value = 0.16). 
There is a significant difference, however, in the EPS forecasts of participants in the Unfavorable, 
Short and Favorable, Short conditions (t44 = 1.65, one-tailed p-value = 0.05), with participants in 
the Unfavorable, Short condition forecasting EPS to be lower than those in the Favorable, Short 
condition (mean for Unfavorable, Short = $3.77; mean for Favorable, Short = $3.79).

It is interesting to note that while participants in the Unfavorable condition spent more time 
gathering additional information than did those in the Favorable condition, participants in the 


15 The results are reported in absolute time (seconds). Due to differences in the total amount of time participants in the 
Favorable and Unfavorable conditions spent viewing the four reports, I do not test differences in absolute time spent 
viewing individual reports across the two conditions. However, in calculating the percentage of time participants in both 
conditions allocated to each report, I find that participants in the Favorable condition spent, on average, 16 percent of 
their viewing time on the low-credibility, preference-consistent report. This is less than the 21 percent of viewing time, 
on average, allocated to the low-credibility, preference-consistent report by participants in the Unfavorable condition 
(t90 = 1.48; p < 0.07, one-tailed). On the other hand, participants in the Favorable condition spent, on average, 39 
percent of their viewing time on the high-credibility, preference-inconsistent report, compared to the 21 percent spent, 
on average, by participants in the Unfavorable condition. This difference is significant (t90 = 3.85; p < 0.001). 

This suggests that the pattern of participants’ additional information search was not due to differing expectations. 
Hales (2007) predicted and found that individuals assigned to hold a long investment position forecasted earnings to be 
higher than that forecasted by those assigned to hold a short investment position when evaluating information that 
implied a loss on their respective investments. However, Hales (2007) found that the earnings forecasts of individuals 
evaluating information that implied a gain on their respective investments were not significantly different. My findings, 
however, indicate a difference in the estimates of participants in the Favorable, Long condition and those of participants 
in the Favorable, Short condition (t = 3.96, p < 0.01). A possible reason for this difference is the information 
participants had an opportunity to view. The experimental design included high-credibility sources forecasting annual 
earnings at both the top and bottom of the range of individual analysts’ forecasts. Therefore, high-credibility information 
was available for those holding a long (short) position to support a higher (lower) earnings forecast. 

Forecast errors (i.e., Forecasted EPS − Actual EPS) are presented in Table 3. Actual EPS was $3.81. Participants in the 
Unfavorable, Short condition had an average forecast error of −$0.04, whereas participants in the Favorable, Short 
condition had an average forecast error of −$0.02. The difference in these average forecast errors is significant (t44 
= 1.65, one-tailed p-value = 0.05). Participants in the Unfavorable, Long condition had an average forecast error of 
$0.05, whereas those in the Favorable, Long condition had an average forecast error of $0.03. The difference in these 
average forecast errors is not significant (t45 = 1.01, one-tailed p-value = 0.16). 
TABLE 3
Participants’ Final Forecasted Earnings-Per-Share

                                          Investment Position       Collapsed across
Earnings Benchmark                        Long         Short        Investment Position

$3.76                 Mean                $3.84        $3.77        $3.81
                      (SD)                             (0.04)       (0.06)
                      Forecast error      $0.03        −$0.04
                      n                                             47

$3.84                 Mean                $3.86        $3.79        $3.83
                      (SD)                                          (0.06)
                      Forecast error      $0.05        −$0.02
                      n                                             45

Collapsed across      Mean                $3.85        $3.78        $3.82
Earnings Benchmark    (SD)                (0.05)       (0.04)       (0.06)
                      n                   47           45           92

The Long/$3.76 and Short/$3.84 cells are Favorable conditions. The Long/$3.84 and Short/$3.76 cells are Unfavorable conditions.

Participants were given the task of forecasting annual EPS for the subject firm. After receiving background information,
including the last three years’ earnings information, historical stock prices, and the consensus analyst forecast of annual
EPS ($3.80), participants had an opportunity to view four analyst reports. After viewing this information, participants made
their EPS forecast. This task provides an opportunity to examine the differential effects of the information viewed on
participants’ earnings expectations.

a The numbers in parentheses represent standard deviations.
b The forecast error rows report the average forecast error (i.e., Forecast Error = Forecasted EPS − Actual EPS). Actual
EPS was $3.81.
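
The forecast errors follow directly from the definition above. The sketch below recomputes them from the four condition means reported in the text, working in cents to avoid floating-point noise; the variable names are illustrative assumptions.

```python
ACTUAL_EPS_CENTS = 381  # actual EPS of $3.81

# Mean final forecasts in cents, keyed by (favorability, position).
mean_forecast_cents = {
    ("Favorable", "Long"): 384,
    ("Unfavorable", "Long"): 386,
    ("Favorable", "Short"): 379,
    ("Unfavorable", "Short"): 377,
}

# Forecast Error = Forecasted EPS - Actual EPS, in cents.
forecast_error_cents = {
    condition: forecast - ACTUAL_EPS_CENTS
    for condition, forecast in mean_forecast_cents.items()
}

# Within each position, compare the size of the mean error across favorability.
larger_error_when_unfavorable = {
    position: abs(forecast_error_cents[("Unfavorable", position)])
              > abs(forecast_error_cents[("Favorable", position)])
    for position in ("Long", "Short")
}
```

On these means, the absolute forecast error is two cents larger in the Unfavorable cell than in the Favorable cell for both positions.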





Unfavorable condition provided final EPS forecasts that were, on average, less accurate than those 
in the Favorable condition. These findings suggest that, in spite of the additional time participants 
in the Unfavorable condition spent gathering information, their choice to view additional 
preference-consistent information over a balanced set of both preference-consistent and 
preference-inconsistent information led them to overestimate (underestimate) the earnings of the 
firm in which they held the long (short) position. 

Lastly, participants’ final EPS forecasts are highly correlated with the percentage of total time 
spent viewing reports issued by analysts forecasting high earnings relative to the consensus (ρ = 
0.56, p < 0.01) and the percentage of total time spent viewing reports issued by analysts forecasting
low earnings relative to the consensus (ρ = −0.57, p < 0.01). Therefore, participants' 
acquisition of information, which was influenced by the favorability of initial information regard- 
ing their individual investment positions, is consequential to their final EPS forecasts. Regardless 
of the fact that participants' compensation was largely determined by the accuracy of their fore- 
cast, participants’ final EPS forecasts were biased in a manner consistent with their preferred 
outcome. 


V. CONCLUSION 
This study examines whether an investor's psychological preference for information that 
supports a recent investment decision dominates his/her economic incentive to gather a balanced 
set of credible information for purposes of future decision making. I argue that the favorability of 
initial information an investor receives subsequent to taking an investment position sets the course 
of his/her information-acquisition process. Given the psychological need to support a belief in a 
prior action when that action is called into question (Festinger 1957), I predict that investors 
receiving unfavorable information regarding a recent investment decision will seek additional 
information that substantiates their decision, even if it requires them to forfeit a certain amount of 
credibility in that information. 

Results from a web-based experiment support my predictions, as I find that the type of 
information viewed by participants is a function of the favorability of the initial information they 
received subsequent to making an investment decision. Participants who initially received favor- 
able information chose to view a more balanced set of preference-consistent and preference- 
inconsistent information compared to participants receiving unfavorable information. Participants 
receiving initial unfavorable information spent a majority of their viewing time on preference- 
consistent information, or information that supported the investment position they chose at the 
beginning of the case. Finally, the participants receiving initial unfavorable information preferred 
to view at a similar rate less credible information that was preference-consistent and more credible 
information that was preference-inconsistent.'? Consequently, participants’ earnings expectations 
were biased in the direction of the information they chose to view. 

While archival studies can explore the impact of investors’ decisions through stock returns, 
these studies cannot easily obtain evidence of the processes that precede individuals’ investment 
decisions. An experimental design allows me to manipulate the type of initial information investors
receive subsequent to making an investment decision (i.e., favorable versus unfavorable),
while controlling the type of additional information available to them, to examine factors that
influence information acquisition. In turn, this study allows more nuanced conclusions about 
investors' use of information relative to extant archival research. 

This study contributes to the accounting and psychology literatures examining individuals' 
search and use of information in three primary ways. First, the results suggest that an investor who 
initially receives information that casts doubt on a recent investment decision will not gather a 
balanced set of credible information, but will seek information that bolsters his/her chosen position.
The resulting information set over-represents information supporting his/her position. Additionally,
some of the over-represented information may have less credibility than information
that is under-represented. The outcome of this information-acquisition process is a set of informa- 
tion that can result in biased estimates and forgone profits. 

Second, this study contributes to the accounting literature examining the influence of source 
credibility on investors' use of information (e.g., Teoh and Wong 1993; Clement and Tse 2003; 
Gleason and Lee 2003; Mercer 2004; Hutton and Stocken 2009). In general, extant research in this 
area does not consider the current investment position of the individual seeking information. My 
study suggests that investors are mindful not only of the credentials of the information source, but 
also of the source's viewpoint. As such, the results have implications for the generality of findings 
that investors' reaction to information is often based on source credibility (e.g., Teoh and Wong 
1993; Clement and Tse 2003; Gleason and Lee 2003). Results also suggest a potential reason for 
the finding that the market's pricing of firms with short-selling constraints is less efficient than that
for firms with short-sale opportunities (Diamond and Verrecchia 1987; Figlewski and Webb 1993).
This inefficiency may stem from bullish investors' lack of attention to bad news.


ue Although I find a difference in participants’ perceived credibility of the available analyst reports, it is possible that their 
familiarity with the two well-known brokerage firms led them to rate the credibility of those reports as higher than those 
issued by two unknown firms. However, given the findings that analysts’ forecast accuracy is positively associated with
employment at large, high-status brokerage firms (e.g., Clement 1999; Hong and Kubik 2003), I do believe that an
analyst’s association with a particular employer speaks to his/her credibility. 
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Last, my study contributes to the psychology literature, as it suggests that prior findings of 
individuals' sensitivity to information credibility, especially when the information is preference- 
inconsistent, represent a boundary condition (e.g., Lowin 1969; Frey 1981; Ditto et al. 1998). 
These psychology studies provide experimental participants with information that varies at ex- 
treme levels of credibility. The capital markets information environment, on the other hand, offers 
investors a vast amount of information from sources representing a wide range of credibility. My 
findings suggest that investors looking to support a previous investment decision are willing to 
forgo a certain level of source credibility to find supportive information. However, at the point that 
the variance in two sources' credibility becomes too extreme, investors will likely choose to 
acquire information from the more credible source. 
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APPENDIX 




INVESTMENT DECISION 


After conducting more than 40 hours of research on the pharmaceutical industry, you
concluded that two firms in the industry are undervalued [overvalued]. Based on your
research you would like to take a long position [short position] in one of these two
stocks.

Note: A long [short] investment position is like taking a bet that the stock price will
increase [decrease].

Upon comparing the two firms, differences between the firms have been narrowed down
to a list of factors. These factors, grouped by firm, are as follows:


First firm:

Last three years' growth rates in revenue:
                          2006     2005     2004
Sales volume              4.80%    5.40%    3.10%
Pricing change            1.50%    0.60%    1.00%
Currency exchange rate    0.50%    0.70%    3.40%
Total growth              6.80%    6.70%    7.50%

Operating Profit Margin last three years:
                          2006     2005
                          21%      18%

R&D expense as a % of Total Revenue: 12%

Resignation of CEO, William Lawson, earlier this year

Joint ventures with laboratories in UK, France, Germany,
India & China. A licensing agreement is also held with
one overseas laboratory.

Increased competition from generic brands in European
market

Second firm:

Last three years' growth rates in revenue:
                          2006     2005     2004
Sales volume              6.30%    7.00%    3.60%
Pricing change            1.10%    0.60%    0.50%
Currency exchange rate    0.60%    0.50%    4.10%
Total growth              7.80%    8.10%    8.30%

Operating Profit Margin last three years:
                          2006     2005
                          25%      27%

R&D expense as a % of Total Revenue: 10%

New independent auditor, Ernst & Young LLP;
Previous auditor, PricewaterhouseCoopers

Manufacturing facilities located in four Western
European countries and one South American country.

2 patent expirations expected in next 18 months; FDA to
review application for short extension of patent
protection


All other factors you would consider in making this investment were similar across the
two firms, including the prices of the respective stocks.

It is important to remember that part of your payment is based on the performance of the
investment position you choose. Please choose the firm in which you would like to take
a long [short] investment position.


INVESTMENT POSITION: HIGHER [LOWER] EARNINGS MEAN MORE POINTS 
You chose a long [short] position in Firm X. The price of Firm X stock at the time of your
investment was approximately $66 per share. Using the firm's forward price-to-earnings (P/E)
ratio of 17.55 [17.19], the implied EPS for the upcoming fiscal year end is $3.76 [$3.84]. Therefore,
your investment benchmark is $3.76 [$3.84].

An additional 2 points will be added to your total for every penny by which actual annual EPS
is above [below] $3.76 [$3.84], up to 25 points.

Your total points will be reduced by 2 points for every penny by which actual annual EPS is
below [above] $3.76 [$3.84], up to 25 points.

If the firm's actual EPS is $3.76 [$3.84], you will neither earn nor lose any points.

In short, higher [lower] earnings leave you with more points. The higher [lower] the firm's
relative performance, the better off you are.
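
The payoff rule described in this instrument reduces to simple arithmetic, which the following minimal Python sketch makes explicit. The function names are hypothetical and not taken from the study. The sketch assumes the benchmark is the share price divided by the forward P/E ratio, rounded to the cent, and that deviations are rounded to whole pennies before the 2-points-per-penny rule and the 25-point cap are applied; the instrument does not say how fractional pennies were handled.

```python
def implied_eps(price: float, forward_pe: float) -> float:
    """Benchmark EPS implied by a share price and a forward P/E ratio."""
    return round(price / forward_pe, 2)


def position_points(actual_eps: float, benchmark_eps: float, position: str,
                    points_per_penny: int = 2, cap: int = 25) -> int:
    """Points gained (positive) or lost (negative) on the investment position.

    A long position gains when actual EPS exceeds the benchmark; a short
    position gains when actual EPS falls below it. Gains and losses are
    both capped at `cap` points.
    """
    if position not in ("long", "short"):
        raise ValueError("position must be 'long' or 'short'")
    # Deviation from the benchmark in whole pennies (rounding is an assumption).
    pennies = round((actual_eps - benchmark_eps) * 100)
    if position == "short":
        pennies = -pennies
    raw_points = points_per_penny * pennies
    return max(-cap, min(cap, raw_points))


long_benchmark = implied_eps(66, 17.55)   # long-position condition
short_benchmark = implied_eps(66, 17.19)  # short-position condition
print(long_benchmark, short_benchmark)
print(position_points(3.80, long_benchmark, "long"))
print(position_points(3.80, short_benchmark, "short"))
```

Under these assumptions, a four-cent deviation in the favorable direction yields the same bonus in either condition, so the long and short treatments are symmetric around their respective benchmarks.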
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ABSTRACT: This study investigates the spread of aggressive corporate tax reporting 
by modeling a firm's decision to adopt the corporate-owned life insurance (COLI) shel- 
ter. Prior studies identify firm characteristics associated with aggressive tax reporting 
(Desai and Dharmapala 2006; Frank et al. 2009) and tax shelter participation (Wilson 
2009; Lisowsky 2010). This study examines whether social environment factors explain 
the pattern of tax shelter adoption. Building on theory related to the diffusion of innovations
and institutional isomorphism, I hypothesize that direct and indirect ties between
prior and potential shelter adopters influence the spread of shelter use. I find that
network ties via board interlocks increase the likelihood of adopting the COLI shelter. I
also find weak evidence that COLI use spreads geographically. However, I find no 
evidence that the spread of COLI use is concentrated among a particular set of audit 
firms or industries. 


Keywords: tax shelters; tax aggressiveness; corporate reporting; diffusion; institutional
isomorphism; board interlocks.


Data Availability: All data are publicly available from sources identified in the text.


I. INTRODUCTION 

During the 1990s, the corporate tax shelter industry boomed, drawing coverage from the
financial press and a full-scale crackdown by the U.S. Treasury. In 1998, Forbes magazine
described a new breed of tax shelters in its cover story, “The Hustling of X-Rated Shelters”
(Novack and Saunders 1998). The following year, the U.S. Department of the Treasury
(1999) released a 164-page report urging the adoption of numerous legislative measures to curb 
the growing tax shelter problem. However, to date we have limited evidence on the factors that 
affect a firm's decision to adopt a tax shelter (Wilson 2009; Lisowsky 2010). 
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Like the Enron-era wave of financial accounting scandals, the most recent corporate tax 
shelter boom alarmed regulators because it signaled a shift in the corporate norm toward aggres- 
sive reporting. In outlining its concern over the proliferation of tax shelters, the U.S. Treasury 
points to the following quote from the New York Bar Association: 

The constant promotion of these frequently artificial transactions breeds significant disrespect for
the tax system, encouraging responsible corporate taxpayers to expect this type of activity to be the
norm, and to follow the lead of other taxpayers who have engaged in tax advantaged transactions.
U.S. Treasury (1999, 3)
This quote motivates the question, “How does a corporate practice such as the use of tax shelters
spread?” and suggests that later shelter adopters imitate early shelter adopters because they see tax
shelter use as the new norm. I investigate the spread of aggressive corporate tax reporting by
examining investments in a specific tax shelter: corporate-owned life insurance (COLI).¹ Drawing
on theories from the innovation-diffusion literature, I test whether factors related to a firm's social 
environment help to explain tax shelter participation. 

Rogers (2003, 11) defines diffusion as “the process by which (1) an innovation (2) is communicated
through certain channels (3) over time (4) among the members of a social system”
(emphasis in the original). Because firms' strategic choices are similar to innovations, researchers 
have looked to innovation-diffusion theory for insights on how and why certain firm practices 
spread (such as poison pills [Davis 1991], investor relations departments [Rao and Sivakumar
1999], multi-divisional form [Palmer et al. 1993], golden parachutes [Davis and Greve 1997], 
options back-dating [Bizjak et al. 2009], and private equity targets [Stuart and Yim 2010]). The 
decision to adopt an innovation is heavily influenced by the decision-maker’s broader social 
environment (Rogers 2003). Before adopting an innovation, decision-makers must have knowl- 
edge of the new strategy—knowledge often gained through direct contacts between peers or 
indirect contacts with other agents of change within the social environment. Once aware of the 
innovation, potential adopters seek to reduce the inherent uncertainty that surrounds the imple- 
mentation of novel ideas by looking to the experiences of their peers. 

The importance of obtaining information on new strategies vicariously by observing prior 
adopters links diffusion theory with a separate, but related field of study: institutional isomorphism.
Institutional isomorphism is concerned with the homogenization of organizational behavior:
when, why, and how firms mimic one another. When faced with uncertainty, organizations
economize on search costs and imitate the behavior of other organizations (Cyert and March 
1963). Mimetic, or imitative, organizational change can be thought of as a contagion process that 
spreads fashionable practices from one firm to another (Haveman 1993). Theory on the diffusion 
of innovations and institutional isomorphism suggests that firms imitate their peers in the adoption 
of new strategies (for example, see Haveman 1993; Haunschild 1993; Rao and Sivakumar 1999). 
I extend the application of innovation-diffusion theory to examine the spread of aggressive tax 
reporting, focusing on the innovative COLI shelter developed in response to the Tax Reform Act 
of 1986 (TRA86). 

First, I analyze the pattern and timing of COLI shelter adoptions. I find that firms’ adoption 
decisions appear to be influenced by prior COLI adopters and by the possibility of changes to the 
tax law. Then, using a matched control sample design, I estimate a cross-sectional logit model of 
tax shelter adoption that includes four social environment factors suggested by prior studies on the 
diffusion of strategic corporate practices: (1) direct contacts between prior and potential adopters 


1 I describe the development of the corporate-owned life insurance shelter in Section II. Unless otherwise indicated by
context, I use the term “COLI” to refer specifically to the COLI shelter, not corporate-owned life insurance in general.
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(proxied by board ties); (2) indirect ties through shared professionals (proxied by auditor ties); and 
imitation of (3) structurally equivalent (industry) and (4) geographically proximate peers. 

I find that direct ties with prior COLI adopters via board interlocks increase the probability 
that a firm adopts COLI. I also find some evidence that COLI adoptions spread geographically. 
However, although auditors were targeted in the public press for their role in the tax shelter boom, 
I find no evidence that COLI adoptions spread via shared auditors. Similarly, I find no evidence
that COLI adoptions spread along industry lines. Finally, I test the robustness of these findings 
using event history analysis and a hazard model that captures both the likelihood of adoption and 
the rate at which a firm adopts, and find similar results with respect to board interlocks. 

This study contributes to recent literature on aggressive corporate tax reporting by examining 
the spread of a specific tax shelter, COLI. Prior studies have uncovered various firm characteristics 
associated with aggressive corporate tax reporting (Desai and Dharmapala 2006; Frank et al. 2009) 
and tax shelter use (Wilson 2009; Lisowsky 2010). Although the results of these studies help to 
explain cross-sectional variation in aggressive reporting behavior and help researchers identify 
which firms were likely engaged in aggressive behavior, they do not offer much insight into the 
wave of aggressive reporting that prompted the U.S. Treasury's report in 1999. Firm behavior is 
clearly related to firm characteristics, but it is also embedded in larger social structures and 
culture. This study examines the impact of those social structures and thus contributes to an 
understanding of the macro-level behavior that took place during the corporate tax shelter boom. 

Section II details the development and mechanics of the COLI tax shelter and the identifica- 
tion of COLI adopters examined in this study. Section III discusses COLI in the context of 
innovation-diffusion theory. Section IV develops a cross-sectional model of COLI adoption and 
discusses the estimation results. Section V describes the event history analysis and presents the 
hazard model estimation results. Section VI concludes. 


II. BROAD-BASED LEVERAGED COLI
Development of the COLI Shelter 

The COLI shelter can be described best as a tax arbitrage transaction: the taxpayer finances 
the purchase of an asset that produces tax-exempt income, a cash value life insurance policy, using 
proceeds from a loan that produces tax-deductible interest expense.² The cash value policy includes
a death benefit and a savings component, both of which are accorded preferential tax
treatment. Unlike other interest income, which is taxed as earned, the interest credited to the cash
value of a policy (the inside buildup) is excluded from gross income (IRC §72).³ Moreover, a
taxpayer can access the cash value of his/her policy while still preserving the deferral benefit by 
borrowing against the policy because loans secured by life insurance contracts are not treated as 
taxable distributions (IRC §72). 

Over time, Congress has tried to limit life-insurance-related tax arbitrage by narrowing the 
statutory definition of life insurance. Concerned about the excessive amount of "key-man" life 
insurance being purchased by firms, Congress limited interest deductions on COLI borrowings by 
capping qualified policy loans at $50,000 per insured life (TRA86). Instead of abandoning the 
market for tax arbitrage after TRA86, insurance industry entrepreneurs responded by offering a 


2 Since payouts upon the death of the insured are generally tax-exempt to the beneficiary, the premium expense related to
the production of that income is not deductible, hence the need for the arbitrage transaction.

3 Distributions of the cash surrender value are generally treated first as a tax-free recovery of basis, and only result in
includible income when the amounts distributed exceed the taxpayer's investment or basis in the policy (IRC §72(e)).
Thus, even taxpayers who withdraw the cash surrender value of a policy can still enjoy a significant deferral benefit.

new product: broad-based, leveraged COLI (BBCOLI). BBCOLI uses volume, covering thousands
of a company’s rank-and-file employees,⁴ to make up for the arbitrage opportunity denied by the
$50,000 per insured policy loan cap.

BBCOLI provides a good setting for studying the spread of aggressive corporate tax report- 
ing. Although defining exactly what constitutes aggressive corporate tax reporting is challenging, 
BBCOLI is generally classified by accounting researchers and legal scholars as a corporate tax
shelter.⁵ Furthermore, details from the Camelot case, In re CM Holdings, Inc. (254 B.R. 578, D.
Del. 2000), reveal exactly when this innovative tax shelter was developed, a critical element in
studying the diffusion of a practice. Last, BBCOLI is a good setting because changes to the law in
1996, government court victories, and the implementation of an IRS amnesty/settlement program
effectively shut down the COLI shelter,⁶ forcing COLI investors to unwind their investments and
bringing COLI investors out of hiding. Thus, COLI represents a chance to identify and study
firms that invest in tax shelters, both those that were caught and those that voluntarily came
forward.


Identifying COLI Shelter Participants 


I use Lexis-Nexis to search 10-Ks, news articles, and press releases for evidence of COLI 
activity using the following string: (company or corporate) w/5 owned w/5 life w/5 insurance w/10 
(tax or taxes). This search captures two main groups of firms: (1) those for which involvement in 
the COLI shelter is revealed ex post because of an IRS dispute, IRS amnesty settlement, or other 
legal dispute, and (2) those that openly disclose their COLI activity in their financial statements 
during active COLI shelter years.⁷ Owens & Minor is an example of the first group; footnote
disclosures in 2000 and 2001 discuss the IRS Notice the firm received regarding COLI and the 
reserve charge booked in light of the IRS's COLI court victories. Marriott is an example of the 
second group, as shown in the following excerpts from the firm's 1995 annual report. The top 
excerpt comes from Management's Discussion and Analysis (MD&A), and the bottom excerpt is 
the reconciliation of the firm's effective tax rate (ETR) to the statutory tax rate provided in the 
notes to the financial statements. 


The Company's effective income tax rate declined to 40.0 percent from 41.4 percent in the pre- 
ceding year, despite expiration of federal jobs tax credit programs, due to the impact of the Com- 
pany's corporate-owned life insurance program and certain other investments. 


4 Winn-Dixie covered nearly all 36,000 of its employees while Wal-Mart allegedly covered 350,000 of its employees
(Winn-Dixie Stores, 113 T.C. 254 (1999); Rice v. Wal-Mart Stores, Inc. 12 F. Supp. 2d 1207).
5 COLI is cited as a classic example of the recent breed of corporate tax shelters (Bankman 2004; Eustice 2002; Gergen
2002) and is included in analyses by Graham and Tucker (2006), Wilson (2009), and Lisowsky (2010).
6 In 1996, Congress further restricted COLI arbitrage, limiting a corporation's interest deductions to those on policy loans
for a maximum of 20 individuals (Health Insurance Portability and Accountability Act, HIPAA 1996). The IRS successfully
litigated three COLI cases, Winn-Dixie (113 T.C. 254), C.M. Holdings (254 B.R. 578), and AEP (136 F. Supp.
2d 762), and implemented an amnesty/settlement program for COLI transactions.
7 Tax shelters are usually characterized by secrecy, and using a sample of firms that openly disclose their activities in their
financial statements is likely to call into question whether these firms truly represent typical tax shelter users. However,
Bankman (2002) suggests that firms that think their shelter activities would need a cloak are less likely to engage in tax
shelters, while those that decide to adopt tax shelters have convinced themselves that their shelter activity meets a
literal reading of the tax law.

                                               1995     1994     1993
U.S. statutory tax rate                        35.0%    35.0%    35.0%
State income taxes, net of U.S. tax benefit     5.0      5.0      6.0
Corporate-owned life insurance                 (1.2)     --       --
Tax credits                                    (1.4)    (0.7)    (0.8)
Other, net                                      2.6      2.2      2.0
                                               40.0%    41.5%    42.2%



My initial text-string search results in 213 possible COLI adopters. I exclude 28 life insurance 
companies because these firms serve as the other party in the COLI tax shelter transaction. 
Bank-owned life insurance, BOLI, is similar to COLI, but since the legislative history surrounding 
limitations on BOLI is different, I exclude 32 banks from the sample of COLI shelter participants.⁸
After excluding banks and insurance firms, 153 potential COLI shelter participants remain. 

Firms often purchase COLI for legitimate business purposes, including insuring against the 
loss of key executives and providing a perk for outside directors. To ensure that I only include 
firms that participate in the tax shelter variety of COLI, I examine the remaining 153 firms in 
detail to confirm participation in BBCOLI. I assume that firms whose COLI programs give rise to 
tax benefits material enough to warrant a line-item on the tax rate reconciliation are involved in 
BBCOLI and include these firms in the shelter sample. I identify additional COLI shelter participants
based on news articles, court cases, and financial statement details regarding COLI-related
IRS disputes or settlements. After examining the 153 firms from my initial search in detail, I
identify 43 firms as COLI shelter participants.⁹


Estimate of COLI Tax Savings 


Table 1 provides various estimates of the magnitude of COLI tax savings for the sample firms. 
Twenty-seven (62.8 percent) of the sample firms provide detailed information on their COLI tax 
savings through their tax footnote rate reconciliations. Using these details, I report estimates of the 
average annual and total tax savings under the columns labeled “Rate Reconciliation." On 
average, over the duration of COLI use, firms report saving $26 million in taxes, equivalent
to 2.9 percent of pre-tax book income. 

For firms that disclose the dollar amount of their COLI-related IRS settlement, I report the 
amount and year of settlement under the columns headed, “Settlement.” The mean (median) 
settlement amount is $50.8 million ($38 million), nearly twice the estimated amount of tax savings 
reported in firms’ rate reconciliations. The last two columns report estimated COLI savings de- 
rived from a variety of other sources, including court cases and footnote disclosures of COLI- 
related reserve charges. The mean COLI tax savings estimated from these sources is $46.8 million. 
In 2000, the staff of the Joint Committee on Taxation estimated that there were 100 COLI cases 
involving nearly $6 billion in taxes (Paull 2000). Multiplying the highest average tax
savings estimate, $50.8 million, by the 43 firms in the sample yields an aggregate tax savings
estimate of more than $2 billion, suggesting my sample accounts for roughly one-third of the
COLI tax shelter market. 
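
The back-of-the-envelope calculation in the preceding paragraph can be reproduced directly. The following is a minimal Python sketch that uses only figures reported in this section; the variable names are illustrative, and treating "nearly $6 billion" as exactly $6 billion is an assumption made for the arithmetic.

```python
# Figures reported in the text, in millions of dollars.
mean_settlement = 50.8        # highest average tax savings estimate
sample_firms = 43             # COLI shelter participants in the sample
jct_total = 6000.0            # "nearly $6 billion" (Paull 2000), assumed exact here

aggregate_savings = mean_settlement * sample_firms
share_of_market = aggregate_savings / jct_total

print(f"Aggregate estimate: ${aggregate_savings / 1000:.2f} billion")
print(f"Share of JCT estimate: {share_of_market:.1%}")
```

Because the multiplication uses the highest of the three average estimates, the resulting share is best read as an upper-end approximation of the sample's coverage.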


8 Neither the economic sham arguments used in court to defeat leveraged COLI nor the 1996 HIPAA restrictions on
leveraged COLI directly apply to BOLI because banks generally do not borrow against the policies to fund the premium
payments.

9 My sample includes all firms whose shelter activity is observable. Tax shelter activity is observable if either (1) the firm
is a disclosing firm, (2) the firm is a non-disclosing firm whose activity is revealed ex post when detected by regulators,
or (3) the firm is a non-disclosing firm whose activity is revealed ex post when the firm comes forward for the COLI
amnesty program. Tax shelter activity is generally not observable for non-disclosing firms whose activities are not
detected by regulators.

10 Because the dollar effect of rate reconciliation line items is harder to interpret in years when pre-tax book income is
negative, I only include COLI tax savings amounts for years with positive pre-tax book income.
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III. THE DIFFUSION OF COLI 
Pattern of COLI Adoption 


I examine the pattern of COLI adoptions over time. There are three basic models of diffusion: 
the internal influence model, the external influence model, and the mixed influence model, which 
allows for both internal and external influences (Mahajan and Peterson 1985).¹¹ Imitation, suggested
by the U.S. Treasury (1999) as a factor in the spread of shelter use, is characteristic of the
internal influence model, which predicts that the rate at which an innovation is adopted is a 
function of the cumulative number of prior adopters: 


$$\frac{dN(t)}{dt} = x(t) = \mathrm{fn}\bigl(N(t-1)\bigr), \tag{1}$$

where N(t) is the cumulative number of adopters at time t, and x(t) is the number of new adopters
at time t. Under the internal-influence model, the rate of adoption, dN(t)/dt, plotted against time
produces a bell-shaped curve and the cumulative diffusion curve is S-shaped (Mahajan and Peterson
1985; Rogers 2003).

In contrast, external models assume no relationship between x(t) and N(t−1); adoption is
driven solely by an external source like mass media or a change agent. Under the external-influence
model, the cumulative number of adopters increases over time, but at a (constant)
decreasing rate (Mahajan and Peterson 1985). COLI was developed by an insurance entrepreneur 
and marketed by insurance brokers, and these tax shelter promoters, acting as external agents of 
change, obviously had an impact on the spread of COLI use. Although court documents reveal 
details of the relationship between shelter promoters and a few COLI adopters, the extent of 
contact between tax shelter promoters and all potential COLI adopters is unobservable. Therefore, 
I generally assume that the influence of external change agents, in the form of tax shelter promoters,
is constant across the sample of potential COLI adopters examined in this study.¹²
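
To make the contrast between the internal-influence and external-influence models concrete, the following Python sketch simulates both in discrete time. The text specifies only that x(t) is a function of N(t−1) under internal influence, so the functional forms used here are assumptions: they are the textbook forms commonly associated with these models, with new adoptions equal to b·N(t−1)·(m − N(t−1)) for internal influence and a·(m − N(t−1)) for external influence. The ceiling m, the coefficients a and b, and the single seed adopter are illustrative values, not estimates from the COLI sample.

```python
def simulate_diffusion(m, periods, a=0.0, b=0.0, n0=0.0):
    """Simulate x(t) = (a + b * N(t-1)) * (m - N(t-1)) in discrete time.

    m  : assumed ceiling on the number of eventual adopters
    a  : external-influence coefficient (a > 0, b = 0: pure external model)
    b  : internal-influence coefficient (a = 0, b > 0: pure internal model)
    n0 : adopters at t = 0; the internal model needs n0 > 0 to get started
    Returns (cumulative, new): N(0..periods) and x(1..periods).
    """
    cumulative = [n0]
    new = []
    for _ in range(periods):
        prev = cumulative[-1]
        x_t = (a + b * prev) * (m - prev)  # new adopters this period
        new.append(x_t)
        cumulative.append(prev + x_t)
    return cumulative, new


# Illustrative parameters only; none are estimated from the COLI sample.
internal_N, internal_x = simulate_diffusion(m=100, periods=20, b=0.008, n0=1.0)
external_N, external_x = simulate_diffusion(m=100, periods=20, a=0.2)

peak = internal_x.index(max(internal_x))
print("internal: new adoptions peak in period", peak + 1)
print("external: new adoptions, first five periods",
      [round(x, 1) for x in external_x[:5]])
```

With these assumed parameters, new adoptions under internal influence rise and then fall (the bell shape), so cumulative adoptions trace an S-curve, whereas under external influence new adoptions decline from the first period onward. Setting both a and b above zero gives the mixed-influence case.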

I examine the pattern of COLI adoption by using data from firms’ rate reconciliations and 
disclosures regarding IRS disputes and settlements to estimate t, the year of COLI adoption.¹³ As
Figure 1 shows, the cumulative number of COLI adopters generally increases from 1986 to 1990, 
levels off in 1991 and 1992, and increases again in 1993. The increases in new adoptions in 1990 
and 1993 can be explained by the legislative history of anti-COLI measures. By 1990, Congress 
and the U.S. Treasury were aware of COLI abuses (U.S. Department of the Treasury 1990), and in 
1991 anti-COLI legislation was proposed in the Senate. Fearing new restrictions on COLI use, but 
expecting that any existing COLI programs would be grandfathered, firms that were considering 
adoption in early 1990 rushed to complete their COLI deals (Gergen 2002). After Congress failed 
to enact the proposed restrictions in 1991 and 1992, the rate of COLI adoptions surged in 1993.

While Figure 1 indicates that the rate of COLI adoptions over time is likely related to the 
history of proposed legislative measures to curb COLI use, the observed rate of COLI adoptions 


11 Importantly, these models assume that an innovation is legal and do not consider the role of an external regulatory body
with the power to curb or halt the rate of adoption. Gergen (2002) analytically models how the risk of detection can 
incentivize tax shelter promoters and tax shelter users to limit the volume of a shelter's distribution. However, this study 
is the first to empirically model the spread of shelter use. 

12 Shelter promoters could have systematically targeted firms with the same firm characteristics I model as predictors of 
adoption, giving rise to an endogeneity problem. However, a systematic bias in the type of firm targeted by shelter 
promoters should bias against findings related to the social environment factors included in the model. 

13 Since managers can exercise considerable discretion in the amount of detail disclosed in the firm's rate reconciliation, 
using rate reconciliations may lead to mismeasurement of t, but any mismeasurement should bias against finding results. 
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FIGURE 1 
Observed Pattern of Cumulative COLI Adoptions 
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Figure 1 presents the pattern of cumulative COLI adoptions, N(f), observed in the COLI sample. 


does not map clearly into the pattern predicted by either a pure internal or pure external source
diffusion model. Below, I present four possible mechanisms of diffusion and test whether each
helps to explain the spread of COLI use.


Sources of Internal Influence


Innovation-diffusion is linked to mimetic isomorphism, the tendency of firms to imitate one 
another, because, in an attempt to gain information and reduce the inherent uncertainty surrounding
an innovation, potential adopters rely heavily on the experiences of prior adopters. The importance
of evaluating an innovation vicariously through the trials of prior adopters is consistent
with anecdotes from shelter participants reported by Bankman (1999, 1781): “Most companies are
reluctant to be the first purchasers, preferring to purchase a shelter that has been vetted by others.”
Under frequency-based imitation, as more and more firms adopt a practice, the practice gains
legitimacy and eventually becomes taken for granted (March 1981). 

Outcome-imitation theorizes that firms adopt strategies when they observe improvements in
the performance of peer firms using those strategies (Haunschild and Miner 1997). Abrahamson 
and Rosenkopf (1993) posit that pressures on firms arising from the threat of lost competitive 
advantage produce competitive bandwagons, even when the returns to an innovation are unclear. 
As the number of adopters in a group increases, non-adopters face the risk of falling farther below 
average if the innovation succeeds. Outcome-imitation is consistent with the U.S. Treasury's 
(1999) report, which identifies pressure to keep ETRs low and in line with competitors as a driving 
force behind the tax shelter boom. Both frequency-based imitation and outcome-based imitation 
imply that the prevalence of a practice within a community will impact the rate of adoption. 
Cohesion

Cohesion models of imitation focus on personal contacts between prior adopters and potential 
adopters. In order to adopt an innovative practice, potential adopters must be aware of the innovation's
existence and be persuaded to adopt it. A potential adopter can learn about a tax shelter
through its network contacts. More importantly, network contacts with prior adopters can help
potential adopters evaluate the benefits and risks associated with a shelter, and therefore impact the 
persuasion stage of the innovation-diffusion process. Unfortunately, network contacts are often 
difficult to identify and measure. Tax shelter participants can be connected through common 
professional association memberships, common board members, common legal advisors, common 
tax consultants, common auditors, or others. 

Board interlocks have been shown to affect the diffusion of a variety of corporate practices 
(e.g., Davis 1991; Mizruchi 1992; Haunschild 1993; Palmer et al. 1993; Bizjak et al. 2009). The 
estimates provided in Table 1 suggest that COLI tax savings were significant, and accounts of the 
negotiations between insurance brokers and firms surrounding the initiation of COLI programs 
document top management's involvement in the decision to adopt COLI. As such, it is reason- 
able to assume that board members were aware of COLI and could have influenced the rate of 
COLI adoptions through their ties to other boards. Even absent this direct type of influence, 
board interlocks proxy as a general indicator that two firms are socially connected (Strang and 
Soule 1998). Therefore, I hypothesize: 


H1: Ties to prior COLI adopters via board interlocks will increase the likelihood that a firm
adopts the COLI shelter. 


Professionals, like lawyers, consultants, and accountants, can act as external change agents, 
influencing the adoption of a new corporate practice by bringing the innovation to their clients. For 
example, court documents indicate that AEP initially considered COLI at the suggestion of their 
accountants, Deloitte, Haskins, and Sells (AEP, Inc. v. U.S., 136 F. Supp. 2d 762). Professionals
can also play a role in firms’ mimetic behavior, acting as the indirect tie connecting prior and 
potential adopters (DiMaggio and Powell 1983). Haunschild (1994) finds that professional firms 
act as a conduit for the spread of acquisition premium information. The uncertainty engendered
by considering an activity of questionable legitimacy, like tax sheltering, can make firms more
likely to rely on information gathered through professional firms, if only to legitimate and rationalize
the decision to engage in the activity (Pfeffer 1981).

Regulators have focused on the role of public accountants in the spread of the tax shelter
industry, arguing that when a firm advises its audit client on a tax shelter transaction, the firm
essentially audits its own work (U.S. Senate Permanent Subcommittee on Investigations 2005).
Because regulators investigating the tax shelter industry have placed a spotlight on auditors and
because auditors are a natural information conduit between prior and potential shelter adopters, I
hypothesize:


H2: Ties to prior COLI adopters via shared auditors will increase the likelihood that a firm
adopts the COLI shelter. 


Structural Equivalence 


In contrast to cohesion models of imitation, structural equivalence models suggest that firms 
imitate other organizations that have similar relationships with their environment, even in the 
absence of direct contacts (Galaskiewicz and Burt 1991). Structurally equivalent firms show 
similar adoption patterns because, in an attempt to maintain their competitive positions, firms 


14 See AEP, 136 F.Supp. 2d 762; In re CM Holdings, Inc., 254 B.R. 578; Winn-Dixie, 113 T. C. 254 (1999); Dow Chemical, 
250 F.Supp. 2d 748. 

15 AEP's CFO consulted with the Chairman and CEO of the firm before adopting the COLI program and appeared before 
the board of directors to explain the program once it had been adopted (AEP, Inc. v. United States, 136 F.Supp. 2d 762).
When evaluating whether to proceed with a broad-based, leveraged COLI plan, Camelot consulted with Ernst & Young 
(In re CM Holdings, Inc., 254 B.R. 578). 
imitate their successful peers (Burt 1987). Anecdotal evidence suggests that firms face significant
pressure to imitate the tax savings strategies of their peers (Novack and Saunders 1998). Table 1
identifies 27 firms that recognized COLI tax savings as a rate reconciliation line-item in the 10-K
tax footnotes, suggesting that even though tax shelters are typically characterized by secrecy,
COLI and its effect on ETR were reasonably public knowledge. Therefore, using
industry membership as a proxy for structural equivalence (Fligstein 1985, 1990), I hypothesize:


H3: The presence of prior COLI shelter adopters in a firm's industry will increase the likelihood
that a firm adopts the COLI shelter.


Spatial Proximity 

Diffusion studies commonly find that spatially proximate actors influence each other (Rogers
2003). For example, Davis and Greve (1997) find that golden parachutes diffused via local business
communities, and Burns and Wholey (1993) find regional differences in the adoption of
matrix management. Shared membership in a geographic region affords a high level of inter-firm
interaction, resulting in the adoption by one firm in an area spurring others to do the same (Strang 
and Soule 1998). Using regional classifications developed by the Department of Commerce’s 
Bureau of Economic Analysis (BEA region), I hypothesize: 


H4a: The presence of prior COLI adopters in a firm's BEA region will increase the likelihood
that a firm adopts the COLI shelter.


Regional differences in diffusion rates can also result from the geographic clustering of 
innovators. Local business elites are connected through a range of formal and informal institutions 
that facilitate communication, from the country club to local charity organizations. Consequently, 
those locales closest to an innovation's origin are likely sources of an innovation’s earliest adopters
(Rogers 2003). Three COLI court cases reveal the involvement of various insurance companies
based in New York and New Jersey. Therefore, I assume that the Mid-East BEA region is the
center of COLI innovation and hypothesize: 


H4b: The greater the geographic distance between the firm's headquarters and the Mid-East 
BEA region, the less likely the firm is to adopt the COLI shelter. 


IV. MODELING COLI ADOPTION 

Control Sample

To test whether imitation contributes to the spread of shelter use, I model a firm's decision to 
adopt COLI using a matched-control design. The foremost factors affecting whether a firm adopts 
or does not adopt a particular tax shelter are the presence of taxable income to shelter and 
compatibility between the technology of the shelter and the firm's organizational structure and 
asset and income mix. The technology of COLI is based on maximizing the tax preferences
related to insurance by investing in contracts on a large number of employees. To match COLI 
adopters to peer firms that could have benefited from adopting COLI but chose not to adopt, I form 


17 A parallel stream of literature in political science examines the regional diffusion of state policy innovations. See Berry
and Berry (1990) for a review of this literature. 

18 Mutual Benefit (MBL) of New Jersey underwrote the policies purchased by Camelot Music and AEP, which was also 
approached with a deal funded by New York Life. AIG of New York underwrote the policies purchased by Winn-Dixie.
19 Industry is one possible way to control for differences in shelter use arising from differences in the match between firms’ 
operations and the technology of a specific shelter; however, to test whether COLI use spread along industry lines, I do
not match COLI and control firms based on industry. 
a control sample based on profitability and workforce size because these two firm characteristics
are likely prerequisites for COLI adoption. First, I rank all Compustat firms into 20 groups based
on ROA and lnEMP separately, where ROA is pre-tax book income minus minority interest
divided by total assets ((PI - MII)/AT), and lnEMP is the log of the number of employees
(EMP). Then, I identify the ROA and lnEMP ranks of each COLI firm in year t-1 and include
any firms with matching ROA and lnEMP ranks in that year in the control sample. The resulting
one-to-many control sample includes 41 COLI firms and 330 control firms.
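
The ranking-and-matching step described above can be sketched in code. This is a minimal illustration under stated assumptions, not the author's procedure: the field names (`id`, `roa`, `ln_emp`), the handling of ties, and the rule that maps sorted positions to 20 groups are all hypothetical choices made for the example.

```python
from collections import defaultdict


def rank_into_groups(values, n_groups=20):
    """Assign each value a group 0..n_groups-1 by its position in sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for position, idx in enumerate(order):
        # Assumption: equal-sized groups by sorted position; ties keep input order.
        groups[idx] = position * n_groups // len(values)
    return groups


def match_controls(firms, coli_ids, n_groups=20):
    """Return {coli_id: [ids of non-COLI firms in the same ROA and lnEMP groups]}.

    firms: list of dicts with hypothetical keys 'id', 'roa', 'ln_emp',
    all measured in the same year (t-1 for the COLI firm being matched).
    """
    roa_grp = rank_into_groups([f["roa"] for f in firms], n_groups)
    emp_grp = rank_into_groups([f["ln_emp"] for f in firms], n_groups)
    key_of = {}
    members = defaultdict(list)
    for firm, r, e in zip(firms, roa_grp, emp_grp):
        key_of[firm["id"]] = (r, e)
        members[(r, e)].append(firm["id"])
    return {
        c: [i for i in members[key_of[c]] if i not in coli_ids]
        for c in coli_ids
    }
```

Each COLI firm is paired with every non-COLI firm that falls in both the same ROA group and the same lnEMP group, which produces a one-to-many sample of the kind described in the text.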

To make the collection of board interlock data manageable, I develop a one-to-one control 
sample from the one-to-many sample and use this for my primary tests. For the one-to-one control 
sample, I choose the control firm with proxy statement data available on Lexis-Nexis that most 
closely matches each COLI firm's ROA and lnEMP in t-1. Given the difficulty in identifying
shelter users, my control sample likely includes firms that used COLI, but whose sheltering
activities have not been detected or disclosed.


Cross-Sectional Model of Firms’ Decisions to Adopt COLI 


Initially, I employ the following cross-sectional logistic model to test whether social environment
factors affect the likelihood of COLI adoption (H1-H4):


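\log\left(P(t)/[1 - P(t)]\right) = \beta_0 + \beta_1 FOREIGN_{i,t-1} + \beta_2 RD_{i,t-1} + \beta_3 INDRANK_{i,t} + \beta_4 SGRWTH_{i,t}
    + \beta_5 ETR_{i,t-1} + \beta_6 BODLINK1_{i,t} + \beta_7 AUDITLINK1_{i,t} + \beta_8 INDLINK1_{i,t}
    + \beta_9 BEALINK1_{i,t} + \beta_{10} DISTANCE_i + \varepsilon_i    (2)
where P(t) is the probability of adopting the COLI shelter.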


Independent Variables 


I test H1 by including BODLINK1, an indicator variable equal to 1 if the firm shared a board
member with a COLI adopter at any time between 1986 and 1995. I hand-collect information on
each COLI and control firm's board of director membership from proxy statements. I assign each
director a unique identifier and determine all board interlocks among the combined set of COLI
and control firms. Both COLI-to-COLI interlocks and COLI-to-control interlocks result in
BODLINK1 equal to 1, while control-to-control interlocks do not.
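
The construction of the board-interlock indicator can be illustrated with a short sketch. It assumes, hypothetically, that each firm's directors over 1986 to 1995 are stored as a set of unique director identifiers; the function and argument names are not from the paper.

```python
def bodlink1(boards, coli_firms):
    """Flag firms that share at least one director with a COLI adopter.

    boards: {firm: set of director ids serving at any time in 1986-1995}
            (hypothetical layout).
    coli_firms: set of firm names that adopted COLI.
    Returns {firm: 1 or 0}. Control-to-control interlocks do not count.
    """
    flags = {}
    for firm, directors in boards.items():
        flags[firm] = int(any(
            other != firm and bool(directors & boards[other])
            for other in coli_firms
            if other in boards
        ))
    return flags
```

A control firm is flagged only through a tie to a COLI adopter, while a COLI firm is flagged only through a tie to a different COLI adopter, matching the rule stated in the text.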

BODLINK1 captures network ties through board interlocks generally. Here, I assume that the
social ties and shared knowledge generated through board interlocks are not limited to actual
interlock years, but spill over to surrounding years. In a later model specification, I construct a
more restrictive measure of the influence of board interlocks that accounts for both the timing of
shared board members and the timing of interlocked firms' COLI adoptions. Hypothesis 1 predicts
that COLI adoptions spread via interlocking directorates; therefore, I expect the coefficient on
BODLINK1 to be positive.

I examine H2, the impact of indirect ties between prior and potential adopters via auditors, by 
including AUDITLINK1, an indicator variable equal to 1 if the firm's auditor in year t audited a
prior COLI adopter. I obtain each firm's auditor during the year from Compustat's Company 


20 I require that firms are incorporated in the U.S. and have total assets greater than $1 million. These requirements apply
to references to “all Compustat firms.”

21 Two of the original 43 COLI firms do not have sufficient data to construct all the explanatory variables included in the 
regression models. 
22 As a precaution, I exclude 19 firms that appear in my initial 213 COLI sample from the control sample even though I
cannot confirm that these firms used broad-based, leveraged COLI. Inclusion of COLI firms in the control samples 
increases the noise in the data and would bias against finding results. 
Auditor data set, data code AU. Consistent with H2, I expect a positive coefficient on AUDITLINK1.

To test H3, I include INDLINK1, an indicator variable equal to 1 if, at time t, a prior COLI
adopter exists within the firm's industry. Industries are classified according to Barth et al. (1998).
INDLINK1 varies by firm and by year but does not vary across industry within a given year.
Hypothesis 3 predicts that firms imitate their structurally equivalent peers in order to stay competitive;
thus, I expect a positive coefficient on INDLINK1.

Similar to INDLINK1, I test H4a by including BEALINK1, an indicator variable equal to 1 if,
at time t, a prior COLI adopter exists within the firm's BEA region. Hypothesis 4a posits that firms
imitate their geographic neighbors; therefore, I expect a positive coefficient on BEALINK1. Hypothesis
4b predicts that shelter use will spread geographically starting from the source of the
innovation. To test H4b, I construct DISTANCE, an ordinal variable equal to 1 for firms headquartered
in the Mid-East BEA region; 2 for firms headquartered in the neighboring New England,
Southeast, and Great Lakes BEA regions; 3 for firms headquartered in the Plains and Southwest
BEA regions; and 4 for firms headquartered in the Rocky Mountain and Far West BEA regions. I
expect the coefficient on DISTANCE to be negative.
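
The industry, region, and distance measures can be written compactly. The sketch below is illustrative only: the dictionary keys follow the region names used in the text, and treating a "prior" adopter as one that adopted in a strictly earlier year is an assumption, since the text does not say how same-year adoptions are handled.

```python
# Ordinal DISTANCE coding taken from the description in the text.
DISTANCE_BY_BEA_REGION = {
    "Mid-East": 1,
    "New England": 2, "Southeast": 2, "Great Lakes": 2,
    "Plains": 3, "Southwest": 3,
    "Rocky Mountain": 4, "Far West": 4,
}


def distance(bea_region):
    """Return the ordinal DISTANCE value for a firm's headquarters region."""
    return DISTANCE_BY_BEA_REGION[bea_region]


def prior_adopter_link(group, year, adoptions):
    """Return 1 if any COLI adoption in `group` occurred before `year`, else 0.

    group: the firm's industry (for INDLINK1) or BEA region (for BEALINK1).
    adoptions: iterable of (group, adoption_year) pairs (hypothetical layout).
    Assumption: only adoptions in strictly earlier years count as prior.
    """
    return int(any(g == group and y < year for g, y in adoptions))
```

The same helper serves both link indicators because each asks only whether an earlier adopter exists in the firm's group.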


Control Variables 


The main control variables in Equation (2), FOREIGN, RD, ETR, and SGRWTH, are drawn
from two recent studies that explicitly model firm characteristics associated with tax shelter par- 
ticipation (Wilson 2009; Lisowsky 2010). Both studies focus on factors that can be used to infer 
shelter use rather than factors related to the decision to adopt a shelter, and both studies compare
non-shelter firms to a group of shelter firms, wherein the participants of several different types of
tax shelters are combined together and treated as one class. One issue with grouping participants
from different types of shelters is that each shelter has its own technology, and only firms whose
traits match the technology of a particular shelter can use it. By modeling a firm’s decision to
adopt a single type of shelter, I hold differences in the technologies of various shelters constant.
Consequently, I do not necessarily expect the same relationship between the control variables and
the decision to adopt the COLI shelter that Wilson (2009) and Lisowsky (2010) hypothesize for
shelter participation in general.

Wilson (2009) finds that large book-tax differences (BTDs) are a marker of shelter use, and 
Lisowsky (2010) finds that tax shelter use is positively related to lagged ETR. Shelters that 
produce permanent book-tax differences are the most sought after (U.S. Treasury 1999; Weisbach 
2002) as these reduce the firm's ETR and enable managers to simultaneously report low taxable 
income and high financial income. In terms of its financial statement effect, the COLI shelter is an 
archetype of recent corporate tax shelters. I include the firm's prior-year ETR, ETR_{t-1}, as a
control variable. However, since firms can adopt COLI either to obtain a competitive ETR or to
maintain an already low and competitive ETR when an alternative tax-savings strategy dries up, I
make no directional prediction for the relationship between ETR_{t-1} and the likelihood of adopting
COLI.

Both Wilson (2009) and Lisowsky (2010) include ROA to capture profitability. Firms with 
greater profitability have more income to shelter. Since matching on ROA precludes including it as
a control variable, I include growth in sales, SGRWTH, as an alternative measure of management’s


23 I omit several of the variables used by Wilson (2009) and Lisowsky (2010) because these variables capture the effect of
tax shelter use rather than factors affecting the decision to adopt a shelter. 

24 See the Appendix for details on the financial statement reporting of COLI transactions. 

25 ETR is calculated as total income taxes (TXT) divided by pre-tax book income (PI). Following Gupta and Newberry 
(1997), I set ETR equal to 1 if TXT/PI is greater than 100 percent, and I set ETR equal to 0 when TXT/PI is negative.
incentive to shelter. I expect firms with higher sales growth to be more likely to adopt a shelter;
therefore, I expect a positive relationship between SGRWTH and the likelihood of adopting COLI. 

Since several corporate tax shelters involve foreign operations and intangibles, Wilson (2009) 
and Lisowsky (2010) predict a positive relationship between measures of these firm characteristics 
and tax shelter participation. However, compared to other codified and shelter-related, tax-savings 
strategies involving specific types of assets, liabilities, income, and expenses, COLI can be used
by a broad set of firms. Therefore, I include two variables that proxy for firms’ non-COLI tax
savings opportunities, RD and FOREIGN, but I expect a negative, rather than a positive, relationship
between these firm characteristics and COLI adoption. To proxy for R&D activity, I include
an indicator variable, RD_{t-1}, equal to 1 if the ratio of R&D expense to sales in the prior year is
positive, and 0 otherwise. I proxy for the extent of a firm's foreign operations with an indicator
variable, FOREIGN_{t-1}, equal to 1 if the ratio of the absolute value of a firm's foreign pretax
income (PIFO) to the absolute value of its total pretax book income (PI - MII) is greater than 10
percent. 
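
The definitions of ETR, RD, and FOREIGN given in the text and its footnotes translate directly into small functions. The sketch below uses the Compustat item names mentioned in the paper as argument names; returning `None` when pre-tax income is zero, and returning 0 when a denominator is zero, are assumptions the text does not address.

```python
def etr(txt, pi):
    """Effective tax rate TXT / PI, set to 1 above 100 percent and 0 if negative."""
    if pi == 0:
        return None  # assumption: an undefined ratio is treated as missing
    return min(max(txt / pi, 0.0), 1.0)


def rd_indicator(rd_expense, sales):
    """1 if the prior-year ratio of R&D expense to sales is positive, else 0."""
    return int(sales > 0 and rd_expense / sales > 0)


def foreign_indicator(pifo, pi, mii):
    """1 if |PIFO| / |PI - MII| exceeds 10 percent, else 0."""
    denominator = abs(pi - mii)
    return int(denominator > 0 and abs(pifo) / denominator > 0.10)
```

All inputs are meant to be prior-year values, since the model uses the lagged versions of these three variables.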

Finally, I include one additional variable suggested by diffusion theory, INDRANK. Larger 
organizations are generally more innovative (Rogers 2003). Furthermore, studies find that firms 
tend to imitate their successful and prestigious peers (Burns and Wholey 1993; Haveman 1993; 
Han 1994). To capture prestige, I rank all Compustat firms into 20 groups by year, industry, and 
total assets. I expect larger firms to adopt earlier and smaller firms to wait and adopt later, if at all; 
thus, I expect a positive coefficient on INDRANK. 


Pattern of Board Interlocks 


Figure 2 depicts all board interlocks within the combined COLI and one-to-one control 
sample during the time period 1986—1995. Of the 82 sample firms, 43 (52.4 percent) share a board 
member with another firm in the sample. COLI firms are shown as circles, including the year of 
COLI adoption, and control firms are shown as squares. The proportion of COLI firms with board
ties to another sample firm (either COLI or control) does not significantly differ from the proportion
of control firms with board ties to another sample firm (58.5 percent versus 46.3 percent; z =
1.11; p = 0.27). However, relative to control firms, COLI firms have significantly more interlocks
to other COLI firms (48.78 percent versus 17.07 percent; z = 3.05; p < 0.01). The pattern of
board interlocks depicted is consistent with H1, and suggests that COLI adopters are tied to one 
another via director contacts. 
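
The two comparisons of proportions above can be checked with a standard pooled two-proportion z-test. This is a sketch under assumptions: the counts (24 and 19 of 41, then 20 and 7 of 41) are inferred from the reported percentages and the 41 firms in each group, and the pooled-variance form is assumed rather than stated in the text.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and its two-tailed p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Any board tie to another sample firm: 58.5 percent versus 46.3 percent.
z_any, p_any = two_proportion_z(24, 41, 19, 41)
# Board ties to COLI firms: 48.78 percent versus 17.07 percent.
z_coli, p_coli = two_proportion_z(20, 41, 7, 41)
```

Under these assumptions the statistics round to the z-values reported in the text, which suggests the inferred counts are consistent with the published figures.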


Pattern of COLI Adoption by Auditor, Industry, and Region 


Table 2 shows the distribution of COLI adoptions across years, auditors, industries, and BEA 
regions. Notably, the first four COLI adopters in the sample each had different external auditors 
and, by 1990, COLI use had spread to clients of all of the former Big 6 auditors as well as BDO 
Seidman. The first four adopters are also in three different industries, and while over half of new 
COLI adoptions occur in the last three years, those adoptions only result in the spread of COLI use 
to two new industries. The pattern of COLI adoptions does not reveal an obvious concentration of 
early COLI adopters in any particular industry, auditor, or geographic region. 

Table 3 provides further detail on the pattern of COLI use by comparing the distribution of 
COLI versus control firms across auditors, industries, and BEA geographic regions. Panel A shows
that the breakdown of firms by auditor is very similar for the COLI and one-to-one control 
samples. Furthermore, the Big 6 auditors audited more than 95 percent of each of the three 
samples: COLI, one-to-one control, and one-to-many control. Panel B indicates that the proportion 
of firms in the textiles and chemicals industries is higher among COLI firms than for either of the 
FIGURE 2
Diagram of Board Interlocks
Figure 2 depicts all board interlocks within the combined COLI and control sample (n = 82 firms) during the
time period 1986-1995. The one-to-one control sample is constructed by matching COLI firms and control
firms based on ROA and lnEMP in the year prior to adoption (t-1) and proxy statement data availability.
TABLE 2
Pattern of New COLI Adoptions across Time, Auditors, Industries, and BEA Regions

            (1)          (2)             (3)               (4)
          New COLI   New Auditors   New Industries   New BEA Regions
Year      adopters   represented     represented       represented
1986          1           1               1                 1
1987          2           2               1                 2
1988          1           1               1                 0
1989          5           2               3                 2
1990          7           1               2                 0
1991          1           0               1                 0
1992          2           0               0                 1
1993         11           1               2                 1
1994          9           0               0                 0
1995          2           0               0                 0
Total        41           8              11                 7


Sample includes 41 COLI adopters with data to construct all explanatory variables. Column (1) shows the number of new 
COLI adopters by year. Broad-based, leveraged COLI was developed in 1985 and 1986 in response to restrictions enacted 
in TRA86 and was effectively shut down by further tax law changes in 1996 (HIPAA). Columns (2)-(4) show the number
of new auditors, industries, and BEA regions represented by those adopters. Auditor is determined from Compustat data
code AU. Industry classifications are based on Barth et al. (1998). BEA region classifications are obtained from the U.S.
Department of Commerce's Bureau of Economic Analysis and assigned based on Compustat data code STATE.


two control samples, but the distribution of firms across industries is otherwise similar among the 
three groups. Contrary to H2 and H3, the descriptive data in Table 2 and Table 3 show no 
particular concentration of COLI adopters by auditor or industry. 

Panel C of Table 3 shows some difference between the COLI and control samples in the 
distribution of firms across geographic region, consistent with H4a and H4b. Notably, although 
10-12 percent of the control firms are headquartered in the Far West region, there are no COLI 
firms headquartered there. Also, relative to the one-to-one control sample, there are significantly 
more COLI firms headquartered in the Great Lakes region (χ² = 3.36; p = 0.07).


Descriptive Statistics 

Table 4 provides descriptive statistics for both the COLI and one-to-one control samples, 
measured in year t. There is no statistical difference between the COLI and control samples in
mean lagged profitability (ROA_{t-1}) or mean lagged workforce size (lnEMP_{t-1}), suggesting an
effective matching procedure. Furthermore, there is no statistical difference in the average size of 
the COLI and control firms. Inconsistent with my expectations, control firms have significantly 
higher sales growth (SGRWTH) relative to COLI firms (12.5 percent versus 7.6 percent; two-tailed 
p = 0.04). Given that the COLI and control firms have similar profitability, one explanation for the 
lower sales growth among COLI firms is that COLI firms, facing lower growth prospects, focus on 
maintaining profitability by reducing expenses, specifically reducing tax expense through the use 
of a tax shelter. 

Consistent with H1, tests for differences in proportions indicate that COLI-to-COLI board 
interlocks are significantly more common than COLI-to-control board interlocks. Contrary to H2, 
but consistent with the distribution of observations across auditors reported in Table 3, the pro- 
portion of COLI firms with links to a prior COLI adopter via shared auditor is actually lower than 
the proportion of control firms with such links (80.49 percent versus 92.68 percent; z = -1.62;
two-tailed p = 0.11). There is no significant difference between the COLI and control sample in
the proportion of firms with links to prior COLI adopters via industry or BEA region. 


Cross-Sectional Logit Results 


Table 5 reports results for estimating a series of nested cross-sectional logistic regressions 
constructed from Equation (2). All models have reasonable explanatory power with pseudo-R²
values ranging from 10.45 percent to 16.33 percent. Of the firm-specific characteristics, only
FOREIGN_{t-1} is significant in all models. As expected, COLI adopters are less likely to have
significant foreign operations relative to non-adopters. Wilson (2009) and Lisowsky (2010) gen- 
erally find that foreign income is positively associated with shelter use; however, their shelter 
samples group a variety of shelters together, including some that can only be used by firms with 
foreign assets or income. As such, the positive association they find between foreign operations 
and tax shelter use may reflect the technology match between certain shelters and certain firms 
rather than a determinant of tax shelter adoption generally. 

The results for RD_{t-1} and INDRANK are not in the predicted direction and vary across the
models. Wilson (2009) also reports mixed results for the effect of R&D intensity on tax shelter 
use. INDRANK is based on the firm’s size (AT) relative to the other firms in its industry. Table 3 
and Table 4 show that COLI and control firms are distributed evenly across industries and are not 
statistically different in size. Furthermore, I match firms based on ROA and lnEMP, diminishing
differences in INDRANK across the two groups. The coefficient on ETR_{t-1} is not significant in any
of the models. In untabulated results, I substitute ETR_{t-1} with measures of the firm's lagged ETR
adjusted for the lagged industry mean (median) ETR, and those measures are also not significant.
Since I match firms based on profitability, which is a major determinant of ETR, there is likely
little systematic variation in the ETR between the control and COLI firms.

Consistent with H1, which predicts that board interlocks impact firms' COLI adoption decisions,
the coefficient on BODLINK1 is positive and significant (1.02; z = 1.72; one-tailed p =
0.04). Holding all indicator variables equal to 1 and all other variables at their median values, a
change in BODLINK1 from 0 to 1 is associated with a 73.6 percent increase (from 33.7 percent to
58.5 percent) in the probability that a firm adopts the COLI shelter. BODLINK1 is a broad measure
that generally proxies for the set of social ties between firms. In the event history analysis pre- 
sented below, I explore the association between board interlocks and the decision to adopt the 
COLI shelter further. 
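
The 73.6 percent figure above is a relative change in predicted probability, and under the logit form of Equation (2) the same two probabilities also imply a change in log-odds. The short check below uses only the rounded probabilities quoted in the text, so its results are approximate; the helper names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Probability implied by log-odds x."""
    return 1 / (1 + math.exp(-x))


p_without_link, p_with_link = 0.337, 0.585

# Relative increase in the predicted probability of adoption.
relative_increase = (p_with_link - p_without_link) / p_without_link

# In a logit model, switching one indicator from 0 to 1 with everything else
# held fixed shifts the log-odds by that indicator's coefficient.
implied_coefficient = logit(p_with_link) - logit(p_without_link)
```

The relative increase comes to roughly 0.736, and the implied log-odds shift comes to roughly 1.02, which agrees with the reported coefficient on the board-interlock indicator.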

Hypothesis 2 predicts that auditors impact firms' COLI adoption decisions. However, the 
coefficient on AUDITLINK1 is negative rather than positive as predicted, suggesting
that prior and potential COLI adopters are not linked via shared auditors. Although the results in 
Table 5 cannot be generalized to other tax shelters, individual or corporate, COLI use does not 
appear to have spread through any particular public accounting firm. 

Hypothesis 3 predicts the likelihood of adopting COLI increases when there is a prior adopter 
in a firm's industry. A positive coefficient on INDLINK1 would imply that firms imitate their
structurally equivalent peers' tax sheltering activities. However, as Table 5 shows, the coefficient
on INDLINK1 is not statistically significant. One possible explanation is that industry does not
accurately capture a firm's structurally equivalent peer group. Another is that, among potential 
diffusing mechanisms, the spread of COLI is better characterized by a direct-contact cohesion 


26 For the least restrictive model, including BODLINK1 and DISTANCE, the Hosmer and Lemeshow goodness-of-fit χ²
equals 76.4, with a p-value for the null hypothesis equal to 0.40, indicating that the model is reasonably well fitted. 

27 Both COLI and control firms may be engaged in other, non-COLI aggressive tax savings strategies either before, during, 
or after the COLI adoption decision; these activities could also impact ETR. 
model than by an indirect structural equivalence model. Similar to H3, H4a predicts the likelihood
of adopting COLI increases when there is a prior adopter in a firm's geographic region. The
coefficient on BEALINK1 is positive, but not significant.

Hypothesis 4b predicts that exposure to COLI through local social factors will be strongest
in locales near the large insurance firms underwriting COLI policies; thus, firms with headquarters
farther away from the Mid-East BEA region will be less likely to adopt the COLI shelter. Consistent
with H4b, the coefficient on DISTANCE is negative and significant (-0.529; z = -1.74;
one-tailed p = 0.04). The last column in Panel B of Table 5 shows that BODLINK1 and DISTANCE
remain statistically significant when they are included in the model together, indicating
that COLI adoptions are influenced by two complementary diffusing mechanisms. Holding all
indicator variables equal to 1 and all other variables, including DISTANCE, at their median values,
a change in BODLINK1 from 0 to 1 is associated with an 84 percent increase (from 31.9 percent to
58.7 percent) in the probability that a firm adopts the COLI shelter. Holding all indicator variables,
including BODLINK1, equal to 1 and all other variables at their median values, a one-unit change
in DISTANCE, from DISTANCE = 2 to DISTANCE = 3, is associated with a 24.5 percent decrease
(from 58.7 percent to 44.3 percent) in the probability that a firm adopts the COLI shelter.

The evidence in Table 5 suggests that COLI adoption spread among firms with shared board 
members and spread out geographically from the Mid-East region. Overall, results from using a 
one-to-one matched control sample and estimating cross-sectional models based on Equation (2)
are consistent with H1 and H4b, but are not consistent with H2, H3, and H4a. In untabulated tests,
I use a one-to-many control sample to test H2, H3, H4a, and H4b. The coefficient on BEALINK1
is positive and significant (0.79; z = 1.86; one-tailed p = 0.03), and the coefficient on DISTANCE
is negative and significant (-0.50; z = -2.33; one-tailed p = 0.01). The results using a one-to-many
control sample are consistent with both H4a and H4b, but are not consistent with H2 and
H3.


V. EVENT HISTORY ANALYSIS

Model Specification


The cross-sectional model presented above tests whether certain social environment factors
affect the likelihood of adopting COLI at time t, but does not test how those factors affect the
timing of COLI adoptions. Below, I follow most diffusion studies and employ an event history or
duration model. The variable of interest in a duration model is the length of time, T, that elapses
between the time a firm is first "at risk" of adopting a new practice and the time, t, at which the
firm adopts the practice. Duration data can be used to estimate a hazard rate, λ(t), the rate at which
the duration ends in the interval [t, t + Δ], given that the duration has not ended prior to the
beginning of this interval:


\lambda(t) = \lim_{\Delta \to 0} \frac{P(t + \Delta > T \geq t \mid T \geq t)}{\Delta}    (3)
I am interested in the hazard rate, the probability that a firm will adopt the COLI shelter at
time t, given that it has not previously adopted the COLI shelter. Although event history analysis
often utilizes continuous-time hazard-rate models, the data in this study are better suited to a
discrete-time model because both the time of adoption and the related explanatory variables are
observed at annual intervals only (Allison 1984).²⁸ I construct a discrete-time event history model
in which each firm contributes an observation for each year it is “at risk” of adopting a COLI


28 Discrete-time hazard models have been used to examine the diffusion of a variety of corporate practices (e.g., Fligstein
1985; Mezias 1990; Rao and Sivakumar 1999) as well as the adoption pattern of various state policies (e.g., Berry and
shelter. Court documents indicate that the COLI shelter was developed in response to TRA86, and
I identify at least one firm that appears to adopt COLI in 1986. Therefore, I assume that firms are
“at-risk” if they have not adopted the COLI shelter and t is greater than 1985 and estimate a
logistic regression of the following form:


$$\log\left(\frac{P(t)}{1 - P(t)}\right) = a(t) + \sum_{j} b_j x_j(t) + \sum_{k} c_k x_k \qquad (4)$$


where P(t) is the probability of adopting the COLI shelter, b_j is the set of coefficients for explanatory
variables x_j(t) that do change with time, c_k is the set of coefficients for explanatory variables
x_k that do not change over time, and a(t) represents separate constants for each period t. Specifically,
I estimate nested models derived from the following logistic regression:


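$$
\begin{aligned}
\log\left(\frac{P(t)}{1 - P(t)}\right) ={}& \beta_0 + \beta_1 \mathit{FOREIGN}_{i,t} + \beta_2 \mathit{RD}_{i,t} + \beta_3 \mathit{INDRANK}_{i,t} + \beta_4 \mathit{SGRWTH}_{i,t} \\
& + \beta_5 \mathit{ETR}_{i,t} + \beta_6\text{--}\beta_{14}\,\text{yr indicators} + \beta_{15} \mathit{BODLINK2}_{i,t} \\
& + \beta_{16} \mathit{AUDITLINK2}_{i,t} + \beta_{17} \mathit{INDLINK2}_{i,t} + \beta_{18} \mathit{BEALINK2}_{i,t} \\
& + \beta_{19} \mathit{DISTANCE}_{i,t} + \varepsilon. \qquad (5)
\end{aligned}
$$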


The discrete hazard model specified in Equation (5) accommodates time-varying covariates by
splitting the history of each sample firm into one-year records or spells, with all spells except the
year of adoption coded as right-censored.²⁹
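
The splitting of each firm's history into one-year spells can be sketched as follows. This is an illustrative sketch, not the estimation code used in the study: the function names, the record layout, and the three hypothetical firms are assumptions, and the at-risk window is assumed to run from 1986 through 1995 (ten firm-years).

```python
from typing import Dict, List, Optional

# Assumed at-risk window: ten firm-years, 1986 through 1995.
FIRST_YEAR, LAST_YEAR = 1986, 1995


def firm_years(firm: str, adoption_year: Optional[int]) -> List[dict]:
    """Split one firm's history into one-year records (spells).

    Every record is coded 0 except the adoption year, which is coded 1.
    A firm leaves the risk set once it adopts; non-adopters remain in the
    risk set through the end of the window.
    """
    records = []
    for year in range(FIRST_YEAR, LAST_YEAR + 1):
        adopted = int(adoption_year == year)
        records.append({"firm": firm, "year": year, "adopt": adopted})
        if adopted:
            break
    return records


def empirical_hazard(records: List[dict]) -> Dict[int, float]:
    """Share of at-risk firms that adopt in each year."""
    at_risk: Dict[int, int] = {}
    events: Dict[int, int] = {}
    for rec in records:
        at_risk[rec["year"]] = at_risk.get(rec["year"], 0) + 1
        events[rec["year"]] = events.get(rec["year"], 0) + rec["adopt"]
    return {year: events[year] / at_risk[year] for year in sorted(at_risk)}


# Hypothetical firms: an early adopter, a late adopter, and a non-adopter.
sample = firm_years("A", 1986) + firm_years("B", 1995) + firm_years("C", None)
hazard = empirical_hazard(sample)
print(len(sample), hazard[1986], hazard[1995])
```

Under these assumptions, a firm that adopts in 1986 contributes one record, a firm that adopts in 1995 contributes ten (nine coded 0 and one coded 1), and a non-adopter contributes ten records coded 0.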

Duration models commonly assume that the baseline hazard rate is time-invariant and that the 
rate of adoption varies with firm characteristics, but is otherwise constant across time. However, as 
Figure 1 indicates, COLI adoptions appear to be related to firms' expectations about legislative 
measures to restrict the preferential tax treatment afforded to life insurance. Therefore, I include 
year indicators to determine how the hazard function changes over time. If the threat of potential 
adverse tax law changes prompted firms to complete COLI programs in the hopes of a
grandfathering provision, then the rate of COLI adoption should be higher in 1989 and 1990 relative to
other years. If Congress' failure to pass legislation curbing COLI benefits in 1991 and 1992 
reduced the uncertainty surrounding COLI, then the rate of COLI adoption should be higher in 
1993 relative to other years. 
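
The way year indicators shift the hazard in a logit specification such as Equations (4) and (5) can be illustrated with a short sketch. All numbers below are hypothetical and chosen only for illustration; they are not estimates from this study, and the function name `hazard` is an assumption.

```python
import math


def hazard(period_constant: float, coefs, covariates) -> float:
    """Convert the log-odds a(t) + sum(b * x) into an adoption probability."""
    log_odds = period_constant + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical values, NOT estimates from the paper.
base_constant = -3.0                           # constant for the base year
year_shift = {1986: 0.0, 1989: 1.2, 1990: 1.0}  # year-indicator coefficients
coefs = [0.5]                                  # one illustrative covariate
covariates = [1.0]

for year, shift in year_shift.items():
    p = hazard(base_constant + shift, coefs, covariates)
    print(year, round(p, 3))
```

A positive coefficient on a year indicator raises the conditional probability of adoption in that year for every firm still at risk, holding the firm's covariates fixed.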


Independent Variables 


In the cross-sectional model, I test H1 using a broad measure of board interlock influence,
BODLINK1, that does not vary across time. In the event history model, I employ a more restrictive
measure of board interlock influence that accounts for the timing of board interlocks and COLI
adoptions. BODLINK2 is equal to the number of board interlocks the sample firm has in year t
with firms that have adopted COLI prior to year t. Consistent with H1, I expect a positive
coefficient on BODLINK2.
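
The timing restriction built into BODLINK2 can be sketched in a few lines. The data structures and firm names below are hypothetical; the sketch assumes that "prior to year t" means an adoption year strictly less than t.

```python
from typing import Dict, Iterable, Optional


def bodlink2(interlocked_firms: Iterable[str],
             adoption_year: Dict[str, Optional[int]],
             t: int) -> int:
    """Count year-t interlock partners that adopted COLI before year t."""
    count = 0
    for partner in interlocked_firms:
        year = adoption_year.get(partner)
        if year is not None and year < t:
            count += 1
    return count


# Hypothetical adoption years (None marks a non-adopter).
adoption_year = {"X": 1987, "Y": 1991, "Z": None}
# Hypothetical set of firms sharing a director with the sample firm in year t.
partners = ["X", "Y", "Z"]

print(bodlink2(partners, adoption_year, 1990))
print(bodlink2(partners, adoption_year, 1992))
```

Because the count depends on t, the same set of interlocks can yield different values of the measure in different firm-years, which is what distinguishes it from the time-invariant BODLINK1.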

The cross-sectional model also employs dichotomous variables to test H2, H3, and H4a. In the 
event history model, I use three alternative measures to test these hypotheses: AUDITLINK2, 
INDLINK2, and BEALINK2. AUDITLINK2 is equal to the number of prior COLI adopters audited 
by the firm's auditor in year t, scaled by the total number of Compustat firms audited by the firm's 


Berry 1992, 1994; Mintrom and Vergari 1998). See Box-Steffensmeier and Jones (1997) for an in-depth discussion of 
the benefit of using discrete-time models for certain data and Omer and Shelley (2004) for an example of the use of a 
discrete-time specification in the accounting literature. 

29 COLI adopters contribute firm years in proportion to the timing of adoption; a firm that adopts in 1986 contributes one
firm year, whereas a firm that adopts in 1995 contributes ten firm years, nine coded as 0 and one coded as 1. Non- 
adopters contribute ten firm years, the length of the window, all coded as 0. 
auditor in year t. If COLI spreads via shared auditors (H2), then the rate of COLI adoption should
be higher for firms whose auditors have prior contact with COLI, resulting in a positive coefficient
for AUDITLINK2.

INDLINK2 is equal to the number of prior COLI adopters in a firm's industry in year t, scaled
by the total number of Compustat firms in the firm's industry in year t. If firms imitate their
structurally equivalent peers captured by shared industry membership (H3), then the rate of COLI
adoption should be higher for firms in industries with a greater frequency of prior COLI adopters,
resulting in a positive coefficient for INDLINK2. Similarly, BEALINK2 equals the number of prior
COLI adopters in the firm's BEA region in year t, scaled by the total number of Compustat firms
in the firm's BEA region in year t. If firms imitate their geographically proximate peers (H4a), then
the coefficient on BEALINK2 will be positive.
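
AUDITLINK2, INDLINK2, and BEALINK2 share one construction: prior adopters in a reference group, scaled by the size of that group. A minimal sketch follows, using hypothetical firms and groups. It assumes that "prior" means an adoption year strictly before t and that the denominator counts every firm in the group, including the sample firm; neither detail is spelled out in the text.

```python
from typing import Dict, Optional


def prior_adopter_share(firm: str,
                        group_of: Dict[str, str],
                        adoption_year: Dict[str, Optional[int]],
                        t: int) -> float:
    """Prior adopters in the firm's group divided by all firms in that group.

    `group_of` maps each firm to its auditor, industry, or BEA region in
    year t, depending on which of the three measures is being built.
    """
    members = [f for f, g in group_of.items() if g == group_of[firm]]
    prior = 0
    for member in members:
        year = adoption_year.get(member)
        if year is not None and year < t:
            prior += 1
    return prior / len(members)


# Hypothetical industry membership and adoption years.
industry = {"A": "retail", "B": "retail", "C": "retail", "D": "banking"}
adoption_year = {"A": None, "B": 1988, "C": 1993, "D": 1987}

print(prior_adopter_share("A", industry, adoption_year, 1990))
```

Passing an auditor mapping or a BEA-region mapping in place of `industry` gives the analogous auditor and regional measures.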

As in the cross-sectional model, I include DISTANCE, an ordinal variable that proxies for the 
geographic distance between a firm's headquarters and the center of early COLI activity. I expect 
firms headquartered within or close to the BEA's Mid-East region to adopt earlier relative to firms 
headquartered further away (H4b); therefore, I expect the rate of COLI adoption to be negatively 
associated with DISTANCE. 
Discrete-Time Hazard Logit Results 

Table 6 reports results from estimating the discrete-time hazard model in Equation (5).³⁰ All
models have reasonable explanatory power, with pseudo-R² values ranging from 15.62 percent to
19.29 percent. The coefficients on the year indicators for 1989, 1990, 1993, and 1994 are positive
and significant, indicating that the hazard rate, the likelihood of adopting COLI in year t given that
the firm has not adopted COLI prior to year t, is significantly greater in those years relative to the
base year, 1986. This result is consistent with firms timing their adoptions in response to concerns
over legislative restrictions. As in the cross-sectional model, FOREIGN_{i,t} is the only control
variable that is significant in the predicted direction across all models.

Consistent with H1, the coefficient on BODLINK2 is positive and significant (1.67; z = 3.72;
p < 0.01); the conditional likelihood of adopting COLI in year t is greater for firms with board ties
to prior COLI adopters. In marginal effect terms, given that the firm has not already adopted
COLI, in 1993, a firm with one board tie to a prior COLI adopter is about 3.4 times as likely to adopt
COLI as a firm with no board ties (an increase in the probability of adopting from 11.97 percent
to 40.74 percent, or roughly 240 percent).³¹
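
The two predicted probabilities reported above can be summarized in several ways that are easy to confuse. The sketch below uses only those two reported figures; the variable names are introduced here for illustration.

```python
# Predicted probabilities of adopting in 1993, as reported in the text.
p_no_tie = 0.1197
p_one_tie = 0.4074

# Ratio of probabilities: how many times as likely adoption is with one tie.
probability_ratio = p_one_tie / p_no_tie

# Relative increase in the probability, expressed as a multiple of the baseline.
relative_increase = (p_one_tie - p_no_tie) / p_no_tie


def odds(p: float) -> float:
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Ratio of odds, the quantity a logit coefficient acts on directly.
odds_ratio = odds(p_one_tie) / odds(p_no_tie)

print(f"probability ratio: {probability_ratio:.2f}")
print(f"relative increase: {relative_increase:.2f}")
print(f"odds ratio:        {odds_ratio:.2f}")
```

The probability ratio and the relative increase always differ by exactly one, so a statement that one firm is "N times as likely" to adopt needs to specify which of the two is meant.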

Contrary to H2 and H3, but similar to the results in Table 5, the coefficients on AUDITLINK2
and INDLINK2 in the discrete-time hazard model are not significant. While DISTANCE is significant
in the cross-sectional models using a one-to-one control sample and both DISTANCE and
BEALINK1 are significant in the cross-sectional model using a one-to-many control sample, neither
DISTANCE nor BEALINK2 is significant in the discrete-time hazard model. One interpretation
of these mixed results is that DISTANCE affects the likelihood of adopting COLI but, contrary to
expectation, DISTANCE is not related to the timing of adoptions.³²


30 Three COLI firms do not have data available to calculate the explanatory variables in all years from 1986 to 1996; thus,
the sample used to estimate Equation (5) consists of 629 firm-year observations representing 38 COLI firms and 38
matched control firms.

31 The marginal effect is calculated by setting the indicator variable for 1993, FOREIGN_{i,t}, and RD_{i,t} equal to 1, all other
year indicator variables equal to 0, and all continuous and ordinal variables equal to their median values.

32 In untabulated results using a Tobit specification, DISTANCE is negatively and significantly related to DURATION,
which equals the total number of years from the time the firm started using COLI until 1996 for COLI firms, and 0 for
control firms (coeff = -1.88; t = -1.99; one-tailed p = 0.03). The Tobit results are consistent with H4b: firms
headquartered farther away from the Mid-East BEA region adopt later.
VI. SUMMARY AND CONCLUSION

This study contributes to the literature on aggressive corporate tax reporting by examining 
how the use of a particular corporate tax shelter, corporate-owned life insurance (COLI), spreads 
over time and across firms. I develop hypotheses based on four social environment factors derived
from the literature on innovation-diffusion and mimetic isomorphism and test whether these factors
help to explain the pattern of COLI adoptions. I find that COLI adopters are connected
through board interlocks. Consistent with a cohesion model of diffusion, a direct tie to a prior
COLI adopter via a board interlock increases the likelihood that a firm adopts the COLI shelter. I
find no evidence that firms' connections to prior COLI adopters via a common auditor are related
to the spread of COLI adoption. I also examine whether imitation of structurally equivalent peers
helps to explain the spread of COLI participation. However, I find that the number of prior
adopters in a firm’s industry is not related to the likelihood that it will adopt COLI. Finally, I find
some evidence that COLI use spreads geographically.
This study specifically examines which diffusing mechanisms help explain the spread of 
COLI use. Not all corporate practices diffuse the same way (Davis and Greve 1997), and the 
importance of a particular diffusion mechanism, like board ties, can wane over time (Mizruchi et 
al. 2006). Although the findings presented here do not necessarily generalize to all corporate tax 
shelters or to the adoption of tax strategies in today's information environment, the results related 
to COLI highlight the importance of examining firms’ decisions in light of the broader social
structures in which they operate. Moreover, the theories of innovation-diffusion and institutional 
isomorphism explored in this study are potentially applicable to a variety of accounting practice 
trends. 





APPENDIX 
Accounting Treatment of COLI 


The level of detail, if any, disclosed in the financial statement footnotes regarding COLI plans
varies considerably from firm to firm.³³ Investments in life insurance are governed by FASB
Technical Bulletin No. 85-4, which prescribes the use of the cash surrender value (CSV) method.
Under the CSV method, a firm should record the CSV of the life insurance policy as an asset, but
under the right of setoff, firms can net outstanding policy loans against the CSV of the policy,
rather than record a separate liability (APB No. 10, FASB Technical Bulletin No. 88-2, and FASB
Interpretation No. 39). Given the right of setoff, firms can participate in the COLI shelter with
little net effect on their balance sheets.³⁴

As seen in the example below, other than through the tax effect, COLI has little income 
statement effect. In a leveraged COLI transaction, net COLI income (expense) equals increases in 
the cash surrender value (interest credited to the policy) plus any death benefits received less the 
premium expense and the interest expense on the policy loans (FASB Technical Bulletin No.
85-4). SFAS No. 109, Accounting for Income Taxes, indicates that the excess of CSV over premiums
paid results in a permanent book-tax difference if the insurance policy is expected to be
held until the death of the insured. Thus, COLI programs that produce material book-tax differences
should be reported in the firm’s tax rate reconciliation. However, firms have considerable
discretion over how permanent differences are netted and aggregated into specific line items on the 
rate reconciliation. 





33 Material for this section is drawn from Nurnberg (2004). 
34 Court documents reveal that Camelot’s COLI plan was set to maintain a zero net equity balance at the end of each year
(In re C.M. Holdings, Inc., 254 B.R. 578 (D. Del. 2000)).
                                            Book     Book-Tax      Tax
                                                    Adjustment
Increase in policy cash surrender value       20       (20)          0
Less: Premium expense                        (12)       12           0
      Interest expense on policy loans       (10)        0         (10)
Net COLI income (expense)                     (2)       (8)        (10)


Under IRC §72, the interest credited to the cash surrender value of the policy, $20 in the example
above, is not included in the policyholder's gross income. Because proceeds from life insurance 
receive preferential tax treatment, policy premiums are not deductible for tax purposes. This 
example shows premium expense of $12, which reduces book income, but not taxable income. 
However, the interest expense paid on policy-backed loans, here $10, is tax deductible. 

The numbers chosen for the example here are arbitrary, but show an important feature of the 
COLI shelter. The transaction produces a loss for tax purposes in excess of any economic or 
financial statement loss. Indeed, even though this example produces a $2 pre-tax book loss, at a 35 
percent statutory rate, the $2.80 tax benefit from $8 of net nontaxable income offsets the $2 
pre-tax book loss, generating positive after-tax income of $0.80. 
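
The arithmetic of the example can be retraced with a short sketch. It follows the calculation as stated in the text, applying the 35 percent statutory rate to the $8 of net nontaxable income; the variable names are introduced here for illustration only.

```python
STATUTORY_RATE = 0.35

# (book amount, tax amount) for each line of the example, in dollars.
items = {
    "Increase in policy cash surrender value": (20, 0),
    "Premium expense": (-12, 0),
    "Interest expense on policy loans": (-10, -10),
}

book_income = sum(book for book, _ in items.values())   # pre-tax book result
taxable_income = sum(tax for _, tax in items.values())  # result for tax purposes
book_tax_adjustment = taxable_income - book_income      # net book-tax difference

# Tax benefit as computed in the text: rate times the net nontaxable income.
tax_benefit = STATUTORY_RATE * -book_tax_adjustment
after_tax_income = book_income + tax_benefit

print(f"Net COLI book income (expense): {book_income}")
print(f"Book-tax adjustment:            {book_tax_adjustment}")
print(f"Net COLI tax income (expense):  {taxable_income}")
print(f"Tax benefit:                    {tax_benefit:.2f}")
print(f"After-tax income:               {after_tax_income:.2f}")
```

Changing the three input lines shows how the gap between the tax loss and the book loss, and therefore the tax benefit, scales with the size of the policy loans and the interest credited to the policy.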
ABSTRACT: We examine a potential benefit associated with the initiation of voluntary
disclosure of corporate social responsibility (CSR) activities: a reduction in firms' cost of
equity capital. We find that firms with a high cost of equity capital in the previous year
tend to initiate disclosure of CSR activities in the current year and that initiating firms 
with superior social responsibility performance enjoy a subsequent reduction in the cost 
of equity capital. Further, initiating firms with superior social responsibility performance 
attract dedicated institutional investors and analyst coverage. Moreover, these analysts 
achieve lower absolute forecast errors and dispersion. Finally, we find that firms exploit 
the benefit of a lower cost of equity capital associated with the initiation of CSR disclo- 
sure. Initiating firms are more likely than non-initiating firms to raise equity capital fol- 
lowing the initiations; among firms raising equity capital, initiating firms raise a signifi- 
cantly larger amount than do non-initiating firms. 
I. INTRODUCTION
The last 15 years have witnessed a steadily increasing emphasis on socially responsible
corporate activities around the world. While third parties, such as KLD Research and
Analytics, Inc. (KLD), often track and rate the corporate social responsibility (CSR) performance
of large firms, firms have also become increasingly willing to voluntarily issue standalone
CSR reports in recent years.¹ According to CorporateRegister.com, a private company that
specializes in tracking CSR reports, few standalone CSR reports were issued in the United States
before the mid-1990s. However, since then, increasingly more U.S. firms have committed to
making this type of disclosure. In 2007 alone, large firms issued about 300 CSR reports. Although
CSR disclosing firms represent only a small fraction of the population of U.S. publicly listed firms,
their aggregate market value constituted over 10 percent of the total U.S. market capitalization in
2007.² The rapid increase in CSR reporting naturally raises questions among researchers: What are
the rationales behind this type of voluntary disclosure? What benefits do firms gain by spending
resources on compiling and publishing these standalone reports, especially given that CSR
performance ratings are often available to investors through third parties?
A number of factors potentially provide answers to these questions, such as the growing 
influence of global enterprises, the intensified scrutiny of corporate impact on the society and the 
economy as a result of a loss of trust after a series of corporate scandals around 2001, and the 
recent rapid growth in ethical/socially responsible investment in the United States and around the 
world.³ Anecdotal evidence also indicates that firms’ reputations and long-term sales can suffer
because of poor CSR performance. For instance, Nike struggled for years and invested a great
amount of financial resources and effort to regain its reputation after the 1997 child labor scandal.⁴
We examine one factor, namely, a reduction in firms’ cost of equity capital, that potentially
provides an explanation for the increasing trend in CSR disclosure. Among various potential
factors influencing CSR disclosure decisions, we focus on the cost of equity capital because it
plays a critical role in a firm’s financing and general operations decisions. Also, corporate executives
appear to believe that voluntarily communicating information can reduce their firms’ cost of
capital (Graham et al. 2005). Further, there is a longstanding interest among academics in the
relation between disclosure and the cost of capital (Diamond and Verrecchia 1991; Botosan 1997;
Leuz and Verrecchia 2000; Botosan and Plumlee 2002).
To determine whether and how CSR disclosure is related to firms’ cost of equity capital, we
employ a sample of firms that intersect two CSR data sources: (1) a comprehensive list of firms
releasing electronic or hard-copy standalone CSR reports since 1993, collected from various
sources on the Internet; and (2) the KLD STATS database that provides detailed CSR performance
ratings for individual firms. Our analyses provide four important insights. First, firms with a high
cost of equity capital in the previous year are significantly more likely than others to initiate
standalone CSR disclosures. Second, the cost of equity capital decreases for CSR-initiating firms
with superior CSR performance. Third, CSR-initiating firms with superior CSR performance
attract dedicated institutional investors and analyst coverage. Moreover, these analysts have more


1 Consistent with McWilliams and Siegel (2001), among others, we define CSR as instances where the company goes
beyond compliance and voluntarily engages in actions that appear to advance social causes, including committing to
environmental and human rights protection, providing community support, and so forth. In practice and academic
research, CSR is often used interchangeably with “sustainability.” We also follow this convention in the paper.

2 This figure is based on the mean market cap of $14.47 billion for firms as represented in Table 2 and the total U.S.
market cap of around $15.35 trillion on May 23, 2007.

3 For example, according to the Social Investment Forum (2007), from 1995 to 2005, assets invested in socially responsible
investment grew from $639 billion to $2.29 trillion, and accounted for approximately 11 percent of the total assets
managed by professional managers.

4 See http://www.bandt.com.au/news/25/0c00d225.asp.
accurate forecasts and lower forecast dispersion. Finally, corroborating the result on the relation
between CSR disclosure and the cost of equity capital, CSR-initiating firms are significantly more 
likely than non-initiating firms to conduct seasoned equity offerings (SEOs) in the two years 
following these initiations and among firms conducting SEOs, CSR-initiating firms raise a signifi- 
cantly larger amount of capital than do non-initiating firms. Overall, our evidence is consistent 
with our predictions that a potential reduction in the cost of equity capital motivates firms to 
publish standalone CSR reports and that CSR disclosure by firms with superior CSR performance 
leads to a lower cost of equity capital. 

This study is the first to investigate the impact of standalone voluntary disclosure of general 
CSR issues on the cost of equity capital. We contribute to the literature by extending the traditional 
research on voluntary disclosure beyond the narrow focus of financial disclosure. The extant 
finance and accounting literatures on voluntary disclosure focus primarily on management forecasts
or conference calls that are short-term-oriented.⁵ In contrast, CSR disclosure, which is broad
in scope, is related to a firm's long-term development strategies and performance sustainability. 
Our results provide evidence on the rationales behind and the consequences of the recent trend in 
voluntary CSR disclosure. 

Our study is related to, but differs from, the work of Plumlee et al. (2008) and Richardson and 
Welker (2001). Plumlee et al. (2008) examine the impact of voluntary environmental disclosure 
quality on firm value. We examine a broader concept of CSR, which includes environmental 
protection, community development, corporate governance practices, employee relations, diversity 
practices, human rights, and product quality. In addition, we use a measure of CSR that is different 
from Plumlee et al. (2008), who use a self-constructed index to measure firms’ environmental 
disclosure quality. We use a proxy that indicates whether firms publish CSR reports. Also, the 
information examined by Plumlee et al. (2008) comes from corporate environmental reports as 
well as annual reports and 10-Ks, which reflect both voluntary and mandatory disclosures. The 
standalone CSR reports we examine are voluntary. 

Our study also differs from Richardson and Welker (2001), who examine the relation between 
the cost of equity capital and social as well as financial disclosure. First, we study U.S. firms, 
whereas they examine Canadian firms. The United States and Canada differ considerably in 
institutions related to information disclosure, with the United States having more stringent regu- 
lations than Canada (Richardson and Welker 2001). If more stringent regulations and the associ- 
ated higher level of litigation risk translate into a generally higher level of disclosure credibility, 
then we can observe different relations between disclosure and the cost of equity capital in these 
two countries. In addition, the CSR measure used by Richardson and Welker (2001) is based on 
annual reports, whereas we focus on standalone CSR disclosures. These two forms of disclosure 
differ in depth and breadth of CSR coverage. 

Methodologically, we differ from Plumlee et al. (2008) and Richardson and Welker (2001) by 
employing a lead-lag approach enhanced with two-stage regressions in sensitivity analysis to deal 
with endogeneity and self-selection issues and by exploring the underlying channels, such as 
institutional ownership and analyst coverage, through which CSR disclosure affects the cost of 
equity capital. In sum, we contribute to the literature by complementing and extending Plumlee et 
al. (2008) and Richardson and Welker (2001). 

Section II develops our hypotheses. Section III describes our sample and methodology. Sec- 
tion IV presents empirical evidence on the relation between CSR disclosure and the cost of equity 
capital. Section V summarizes and concludes. 


5 One of the few exceptions is Dietrich et al. (2001), who investigate the effect of the supplemental disclosure of 
forward-looking information on security prices. 
II. RELATED RESEARCH AND HYPOTHESIS DEVELOPMENT

Most prior research on the relation between disclosure and the cost of capital focuses on 
financial disclosure (Core 2001; Healy and Palepu 2001; Leuz and Wysocki 2008). The consensus 
appears to be that a negative relation exists between the quality of financial disclosure and the cost 
of capital. Greater disclosure increases investors' awareness of a firm's existence and enlarges its 
investor base, which improves risk-sharing and reduces the cost of capital (Merton 1987). In 
addition, higher quality or more precise firm-specific disclosures decrease the covariance of a
firm's cash flow with the cash flows of other firms (Hughes et al. 2007; Lambert et al. 2007), 
which essentially reduces the betas of individual firms and, hence, the cost of equity capital. 
Similarly, greater disclosure can lead to reduced information asymmetry among investors or be- 
tween managers and investors. When the level of disclosure is inadequate and some investors are 
perceived to be better informed than others, informationally disadvantaged investors price-protect 
themselves and become less willing to trade. The resultant illiquidity increases the bid-ask spread 
and transaction costs (Verrecchia 2001), which leads to a higher required rate of return or cost of 
equity capital (Amihud and Mendelson 1986). 

These mechanisms likely apply to both financial and nonfinancial disclosure, as long as the 
information concerned is value-relevant. Indeed, a fair amount of research suggests that CSR 
information is value-relevant (Margolis and Walsh 2001; Orlitzky et al. 2003; Al-Tuwaijri et al. 
2004). Of course, CSR practices can affect firms' financial performance and value through chan- 
nels other than those related to financial disclosure. For instance, voluntary socially responsible 
behavior can help firms avoid government regulation and, therefore, reduce compliance costs. In 
addition, socially responsible firms appeal to consumers who care about the corresponding social 
issues, which leads to superior sales and financial performance (Lev et al. 2010). Socially aware
investors are willing to pay a premium for the securities of socially responsible firms (Anderson
and Frankel 1980; Richardson and Welker 2001). Perhaps more important, some CSR projects
have direct implications for positive cash flow even in the near future. For example, practices
related to protecting the environment and improving employee welfare can reduce potential
litigation and pollution cleaning costs and boost employee morale and, thereby, production efficiency.
These arguments highlight the importance of CSR disclosure in reducing information asymmetry
and uncertainty related to factors affecting firm value (Rodriguez et al. 2006), which in turn
reduces the cost of equity capital.

Nevertheless, a straightforward generalization of the cost of capital effect from financial 
disclosure to nonfinancial CSR disclosure is not always obvious. Standalone CSR reports are 
currently subject to very limited regulatory guidance. There is a common concern about the 
usefulness of this type of disclosure because of noncomparability and potential credibility issues 
and opportunistic behaviors of firms (Ingram and Frazier 1980; Hobson and Kachelmeier 2005). 
In the end, whether voluntary CSR disclosure reduces a firm's cost of equity capital is an empirical 
question. 

It is important to note that CSR performance ratings of large firms are often available to
investors through third parties. These ratings could be directly associated with the cost of equity 
capital of these firms. However, ratings alone are unlikely to provide sufficient information for 
investors to assess firms' overall CSR performance. Detailed CSR disclosures potentially provide 
additional information necessary for investors to assimilate these summary ratings.⁷ Further,
voluntarily disclosing CSR activities demonstrates firms’ confidence in their CSR performance, which








6 Although some accounting and consulting firms provide voluntary assurance service (Simnett et al. 2009), there is not
yet a government standard that regulates this service, and the assurance industry is still in its infancy.

7 An obvious analogy is the usefulness of footnote disclosures and management discussions in supplementing financial
statements.
sends a positive signal to investors, or, in the case of poor CSR performance, allows firms to offer
explanations. Therefore, CSR disclosures contain information beyond that contained in CSR per- 
formance ratings. 

Some firms also disclose information on CSR activities in their annual reports or filings with 
the SEC. However, a firm's voluntary compilation and publication of standalone CSR reports 
demonstrates its special effort and commitment to improving transparency regarding long-term 
performance and risk management. More importantly, compared with the CSR information pro- 
vided in annual reports or 10-Ks, standalone CSR reports are more comprehensive and contain 
significantly more details.⁸ Therefore, standalone CSR reports likely provide incrementally useful
information for investors to evaluate firms’ long-term sustainability. Focusing on standalone CSR 
reports can thus improve the power of our tests and shed light on this new form of voluntary 
nonfinancial disclosure. 

Our first hypothesis predicts that a possible reduction in the cost of equity capital provides an 
incentive for firms to publish CSR reports. Frankel et al. (1995) find that firms increase their level 
of voluntary disclosure to raise capital in the future at a lower cost, which suggests that firms with 
a relatively higher cost of capital likely have a greater incentive to enhance disclosure. Lending 
support to the cost of capital incentive for disclosure, Sletten (2008) finds that stock price declines, 
which imply an increase in firms’ cost of equity capital, induce managers to disclose more 
information.⁹

Of course, endogeneity and self-selection issues can arise if we examine a contemporaneous 
relation between CSR disclosure and the cost of equity capital. On the one hand, if CSR disclosure 
is motivated by a firm’s desire to reduce its high cost of equity capital, then we should find a 
positive relation between CSR disclosure and the cost of equity capital. On the other hand, if
CSR disclosure leads to a lower cost of equity capital, then we should find a negative relation 
between CSR disclosure and the cost of equity capital. Therefore, the contemporaneous relation 
between CSR disclosure and the cost of equity capital could be ambiguous. To address the poten- 
tial endogeneity and self-selection issues related to CSR disclosure and the cost of equity capital, 
we employ a lead-lag approach in our main analyses and state our first hypothesis below: 


H1: The likelihood that a firm will disclose its corporate social responsibility activities is 
positively associated with its cost of equity capital in the previous year. 
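
The lead-lag design behind H1 pairs each firm-year disclosure outcome with the cost of equity capital from the preceding year rather than the same year. A minimal sketch of that pairing follows; the data layout, function name, and all values are hypothetical and are not drawn from the sample used in the paper.

```python
from typing import Dict, List, Tuple

FirmYear = Tuple[str, int]


def lead_lag_pairs(cost_of_equity: Dict[FirmYear, float],
                   disclosed: Dict[FirmYear, int]
                   ) -> List[Tuple[str, int, float, int]]:
    """Match disclosure in year t with the cost of equity capital in year t - 1.

    Firm-years without a prior-year cost of equity estimate are dropped.
    """
    pairs = []
    for (firm, year), flag in sorted(disclosed.items()):
        lagged = cost_of_equity.get((firm, year - 1))
        if lagged is not None:
            pairs.append((firm, year, lagged, flag))
    return pairs


# Hypothetical inputs: cost of equity (in percent) and a disclosure indicator.
cost_of_equity = {("A", 2000): 11.2, ("A", 2001): 10.1, ("B", 2001): 8.4}
disclosed = {("A", 2001): 1, ("A", 2002): 0, ("B", 2001): 0, ("B", 2002): 0}

for row in lead_lag_pairs(cost_of_equity, disclosed):
    print(row)
```

Because the explanatory variable is measured before the disclosure decision, a positive association in such pairs cannot be produced by disclosure lowering the cost of equity capital in the same period.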


If CSR disclosure provides information that is incremental to information provided in third- 
party CSR performance ratings or other information dissemination channels such as annual reports 
or 10-Ks, then the preceding discussion suggests that CSR disclosure should lead to a lower cost 
of equity capital. This logic suggests the following hypothesis: 


H2: Corporate social responsibility disclosure is associated with a subsequently lower cost of 
equity capital. 

Support for H1 and H2 would provide justification for the rationales behind and the consequences
of CSR disclosure. We also test a corollary of H1 and H2 by examining whether disclosing firms
seek external financing after CSR disclosures. If CSR disclosure is motivated by firms'
desire to reduce the cost of equity capital, then these firms will be more likely than non-disclosing
firms to raise equity capital after their CSR disclosures to exploit the reduction in their cost of
equity capital, and they will also strive to raise a larger amount. While we formulate our
predictions based on CSR disclosure, in the empirical analysis we focus on CSR-disclosure-initiating
firms since initial reports likely contain more information than mundane continuing reports.

7 In untabulated analyses and relying on manual data collection, we compare CSR-related content in the first-time
standalone CSR reports and annual reports (or 10-Ks in the absence of annual reports) of 50 firms out of our final
sample of 213 firms. We find that, on average, standalone CSR reports are significantly longer (28.3 pages versus 1.5
pages) and cover significantly more CSR issues (6.4 issues versus 1.5 issues) compared to annual reports or 10-Ks. The
inference of the above comparison is also supported by a comprehensive survey conducted by KPMG (2008), which
finds that among the largest 100 U.S. firms, only about 1 percent of them adequately integrate CSR reports into their
annual reports.

8 However, the result documented by Sletten (2008) could be attributable to either the numerator (cash flow) effect or the
denominator (cost of capital) effect, or both.





III. SAMPLE AND METHODOLOGY
Sample Description 


CSR disclosure policies can be sticky across years. Therefore, we focus on first-time standalone
CSR reports. We collect standalone CSR reports issued by U.S. firms from various sources,
including (1) Corporate Social Responsibility Newswire, (2) CorporateRegister.com, (3) Internet
searches, and (4) company websites. The first two sources are the two leading organizations
collecting and disseminating news and information related to CSR. We verify our CSR reporting
sample by checking whether we can find their actual standalone CSR reports. 9

In our main analyses, we control for the relative social responsibility performance of sample
firms, as proxied for by the KLD social performance rating scores. Our final sample comprises
firms that are in both the KLD STATS and Compustat databases. KLD evaluates CSR performance
for all covered firms along a variety of dimensions, regardless of whether they release standalone
reports. 11 Starting from 1991, KLD STATS rated approximately 650 companies every year,
comprising mainly all firms in the S&P 500 and Domini 400 Social SM Index. During 2001 to 2002,
KLD expanded its coverage to include the largest 1,000 U.S. companies by market capitalization.
Since 2003, it has covered the largest 3,000 U.S. companies based on market capitalization.

Table 1, Panel A shows the industry distribution, based on Barth et al.’s (1998) industry
classifications, of CSR reports and disclosing firms. During the 1993–2007 period, 294 firms
issued a total of 1,190 standalone CSR reports. 12 The Utilities industry has the largest proportion
(30.4 percent) of firms publishing CSR reports, while the Services and Insurance/Real Estate
industries have the lowest proportion of disclosing firms (2.15 percent and 0.40 percent,
respectively). Consistent with the broad scope of CSR disclosure, many non-pollution-prone industries
including the Food and Retail industries also actively disclose their social performance. After
eliminating 81 firms because of missing data, our final sample contains 213 disclosing firms. The
Utilities industry constitutes the largest proportion of the final sample (13.4 percent). Table 1,
Panel B presents the distribution by year of CSR reports and disclosing firms. Overall, there is a
steadily increasing trend in the number of CSR reports over time from 8 in 1993 to 184 in 2007.
The average report length nearly doubles from about 20 pages in the early 1990s to more than 40
pages in the most recent years. On average, a CSR report has 36 pages. 13








10 It is tempting to examine the information content of CSR disclosures. However, this test is hampered by the lack of
information on the exact reporting dates of the reports. Nevertheless, we conduct an event study based on the reporting
months of the reports. We find that (1) during the CSR reporting month, there is no difference in raw and
market-adjusted returns between high and low CSR performance firms; (2) during the three-month period following the CSR
reporting month, high CSR performance firms appear to do slightly better than low CSR performance firms, based on
market-adjusted returns; and (3) there is no difference in returns between CSR reporting months and non-CSR reporting
months.
11 The Appendix to this paper lays out the main categories of CSR issues employed by KLD in its rating process and also
the average rating scores across industries.
12 Sometimes a firm publishes multiple CSR reports, often discussing different CSR-related issues such as environmental
versus non-environmental matters, in a single year. When that is the case, we combine them into one firm-year
observation.
13 The statistics for page numbers are based on all CSR reports published in the year, not just on first-time reports.
Empirical Models and Variable Definitions


Past Cost of Equity Capital and Current-Year CSR Disclosure 

To test H1, we examine whether a high cost of equity capital in the previous year gives firms 
an incentive for CSR disclosure in the current year. In the empirical regression model, we control 
for other determinants of CSR disclosure to parse out potential confounding effects. However, the 
current literature provides limited information on what motivates a firm's CSR disclosure decision. 
As CSR disclosure is part of a firm's overall voluntary disclosure strategy, we identify potential 
factors from the voluntary disclosure literature that influence a firm's decision to commit to CSR 
disclosure. Our logistic regression model is specified as follows: 


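\[
\begin{aligned}
\log\left[\frac{\mathrm{prob}(DISCI_{i,t})}{1-\mathrm{prob}(DISCI_{i,t})}\right]
={}& \beta_0 + \beta_1 COC_{i,t-1} + \beta_2 PERFORM_{i,t-1} + \beta_3 HICONCERN_{i,t-1} + \beta_4 SIZE_{i,t-1} \\
&+ \beta_5 LITIGATION_{i,t-1} + \beta_6 ROA_{i,t-1} + \beta_7 COMPETITION_{i,t-1} + \beta_8 FIN_{i,t-1} \\
&+ \beta_9 TOBINQ_{i,t-1} + \beta_{10} LEV_{i,t-1} + \beta_{11} GLOBAL_{i,t-1} + \beta_{12} LIQUIDITY_{i,t-1} \\
&+ \beta_{13} ABS\_EM_{i,t-1} + \beta_{14} CIG_{i,t-1} + \sum IND_i + \sum YEAR_t + \varepsilon_{i,t}
\end{aligned}
\tag{1}
\]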


where DISCI_{i,t} is an indicator variable that equals 1 if firm i discloses a standalone CSR report for
the first time in year t (initiating firm-years or initiators), and 0 (non-initiating firm-years or
non-initiators) otherwise. Therefore, the control group (DISCI = 0), namely, non-initiators,
includes all years of firms that never issue CSR reports and the years before and after CSR-initiating
firms’ first-time reports.

Our main variable of interest, the cost of equity capital in the year prior to first-time CSR 
disclosure, COC, is the ex ante or implied cost of equity capital, calculated using three different 
models, namely, those of Gebhardt et al. (2001), Claus and Thomas (2001), and Easton (2004). 
The mean of the three measures (COC_AVG) serves as our proxy for the cost of equity capital. To 
implement the estimation, we obtain expected future earnings per share from I/B/E/S and market 
price and dividend per share from Compustat. 

We include a number of control variables in the regression. PERFORM is the total KLD score 
of CSR strengths, which we use to proxy for firms’ CSR performance. Firms with better social 
performance have a greater incentive to disclose (Dye 1985). The KLD database is widely used in 
CSR research (Graves and Waddock 1994; Berman et al. 1999; Baron et al. 2009). Waddock 
(2003, 369) regards it as “the de facto (CSR) research standard at the moment.” 14 KLD ranks
firms’ CSR performance in seven main categories: (1) community, (2) corporate governance, (3)
diversity, (4) employee relations, (5) environment, (6) human rights, and (7) product. 15 We adjust
raw CSR strength scores each year by industry medians to get relative performance scores that are 
comparable across industries. 
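
The industry-median adjustment can be illustrated with a short sketch. It assumes the adjustment is a simple subtraction of the industry-year median from each firm's raw strengths score, which is one natural reading of the description above; the function name, the tuple layout, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def industry_adjusted_scores(rows):
    """Subtract the industry-year median from each raw KLD strengths score.

    rows: list of (firm, year, industry, raw_strengths) tuples.
    Assumption: "adjusting by industry medians" means raw score minus median.
    """
    groups = defaultdict(list)
    for firm, year, industry, raw in rows:
        groups[(year, industry)].append(raw)
    medians = {key: median(values) for key, values in groups.items()}
    return {
        (firm, year): raw - medians[(year, industry)]
        for firm, year, industry, raw in rows
    }


# Made-up firm-years for illustration only
sample = [
    ("A", 2005, "Utilities", 4),
    ("B", 2005, "Utilities", 1),
    ("C", 2005, "Utilities", 2),
    ("D", 2005, "Food", 3),
]
adjusted = industry_adjusted_scores(sample)
```

A positive adjusted score marks a firm whose strengths count exceeds that of its typical industry peer in the same year.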


14 Of course, there is also no lack of criticism of the KLD database. For example, KLD uses indicator variables to describe
firms’ CSR performance. This is a crude methodology and potentially suffers from loss of information. Chatterji et al. 
(2009) show that KLD environmental strengths do not accurately predict pollution levels or compliance violations, and 
that KLD ratings do not optimally use publicly available data. 

15 The rankings are based on information obtained from surveys, financial statements, government documents, peer- 
reviewed legal journals, and reports from mainstream media. KLD defines a set of potential strengths under each 
category and assigns a value of 1 if a strength exists, and a value of 0 otherwise. See the Appendix for more details on 
KLD’s rating categories. 
We control for firm size (SIZE) because size captures various factors motivating firms to issue
CSR reports such as public pressure or financial resources (Lang and Lundholm 1993). We measure
SIZE as the natural logarithm of the market value of common equity at the beginning of each
year. Skinner (1997) argues that firms facing a higher level of litigation risk (LITIGATION) are
more likely to make voluntary disclosure to preempt potential lawsuits. LITIGATION is an
indicator variable that equals 1 if a firm operates in a high-litigation industry (SIC codes of
2833-2836, 3570-3577, 3600-3674, 5200-5961, and 7370), and 0 otherwise (Francis et al. 1994;
Matsumoto 2002). As firms with better financial performance likely have more resources to practice
CSR activities and produce CSR reports, we include return on assets (ROA), computed as income
before extraordinary items scaled by total assets at the beginning of each year.
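
A minimal sketch of how these three controls could be coded from the definitions above. The function and argument names are hypothetical, and the SIC ranges are taken as inclusive.

```python
import math

# High-litigation SIC ranges as listed in the text (treated as inclusive)
HIGH_LITIGATION_SIC = (
    (2833, 2836),
    (3570, 3577),
    (3600, 3674),
    (5200, 5961),
    (7370, 7370),
)


def litigation(sic_code):
    """Return 1 if the SIC code falls in a high-litigation range, else 0."""
    return int(any(low <= sic_code <= high for low, high in HIGH_LITIGATION_SIC))


def size(market_value_equity):
    """Natural log of beginning-of-year market value of common equity."""
    return math.log(market_value_equity)


def roa(income_before_extraordinary_items, lagged_total_assets):
    """Income before extraordinary items over beginning-of-year total assets."""
    return income_before_extraordinary_items / lagged_total_assets
```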

Dye (1985) suggests that proprietary costs arising from product market competition can
reduce disclosure incentives. Hence, we control for industry competition (COMPETITION), which
is proxied by the Herfindahl-Hirschman Index multiplied by -1. This index is computed as the
sum of the squared fractions of sales of the 50 largest firms in an industry (industries are defined
based on the two-digit SIC codes). In cases where there are fewer than 50 firms in an industry, we
use all firms in the industry to calculate market shares. In addition, firms raising capital in the
public market have a greater propensity to make voluntary disclosures (Frankel et al. 1995). We
control for a firm's financing activities (FIN) by assessing the amount of debt or equity capital
raised by the firm during the year scaled by total assets at the beginning of the year. Following
Richardson et al. (2004), FIN is measured as the sale of common and preferred shares minus the
purchase of common and preferred shares plus the long-term debt issuance minus the long-term
debt reduction.
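
The two measures just described can be sketched as follows. This is an illustration under stated assumptions: market shares are taken relative to the combined sales of the firms included in the index (the text does not specify the denominator), and all function and argument names are hypothetical.

```python
def competition(industry_sales, top_n=50):
    """Negative Herfindahl-Hirschman Index for one two-digit SIC industry.

    industry_sales: list of firm-level sales figures.
    Assumption: shares are computed relative to the combined sales of the
    firms included (the top_n largest, or all firms if there are fewer).
    """
    largest = sorted(industry_sales, reverse=True)[:top_n]
    total = sum(largest)
    hhi = sum((sales / total) ** 2 for sales in largest)
    return -hhi


def fin(stock_sales, stock_purchases, debt_issuance, debt_reduction,
        lagged_total_assets):
    """Net external financing scaled by beginning-of-year total assets."""
    net = stock_sales - stock_purchases + debt_issuance - debt_reduction
    return net / lagged_total_assets
```

Because the index is multiplied by -1, values closer to zero indicate a less concentrated, more competitive industry, and a negative FIN indicates net repurchases or debt repayment.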

We also control for growth opportunities (TOBINQ) because firms in an expansionary period 
are more financially constrained and have fewer resources for CSR activities and disclosure. 
However, growth firms also tend to have higher levels of information asymmetry, which could 
induce managers to make more disclosures to attract potential investors. The net effect is hence 
unknown ex ante. TOBINQ is Tobin's Q, defined as the market value of common equity plus the 
book value of preferred stock, book value of long-term debt and current liabilities, scaled by the 
book value of total assets. We include the debt ratio (LEV) in the model because debt servicing 
plays a monitoring role and debt holders demand greater disclosure (Leftwich et al. 1981). We 
define LEV as the ratio of total debt divided by total assets.
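
As a hedged illustration of the two ratios defined above, the sketch below computes TOBINQ and LEV from their stated components. The argument names are hypothetical stand-ins for the underlying Compustat items, which the text does not list.

```python
def tobin_q(market_value_equity, preferred_stock, long_term_debt,
            current_liabilities, total_assets):
    """Market value of common equity plus book values of preferred stock,
    long-term debt, and current liabilities, scaled by book total assets."""
    numerator = (market_value_equity + preferred_stock
                 + long_term_debt + current_liabilities)
    return numerator / total_assets


def lev(total_debt, total_assets):
    """Debt ratio: total debt divided by total assets."""
    return total_debt / total_assets
```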

In addition, firms with a global focus, especially those operating in emerging markets, face
greater pressure to commit to social performance and are accordingly more likely to provide CSR
disclosure. GLOBAL is an indicator variable that equals 1 if a firm reports foreign income, and 0
otherwise. Further, managers have incentives to increase the liquidity of their firms' stock in order
to issue equities or sell shares of their firm obtained from options or other incentive compensation
plans. One way to increase liquidity is to improve transparency and supply more information to
investors. Our liquidity measure, LIQUIDITY, is the ratio of the number of shares traded in the
year to the total shares outstanding at the year-end.

Finally, CSR disclosure could be correlated with the general disclosure policies and financial
transparency of firms. To control for this possibility, we include two variables to proxy for firm
financial disclosure quality and voluntary disclosure policy: earnings quality (ABS_EM) and
management earnings forecasts (CIG). We use the absolute value of abnormal accruals from the
modified Jones (1991) model, based on Dechow et al. (1995), to proxy for earnings quality
(Francis et al. 2008). 16 Following prior studies that use management forecasts as a direct measure
of a firm's disclosure policy (Rogers and Van Buskirk 2009), we define CIG as an indicator
variable that equals 1 if a firm issues at least one earnings forecast in the year, and 0 otherwise. In
all specifications of the model, we include industry and year indicators to control for potential
industry and year effects.

16 Using the original Jones (1991) model or an alternative version developed by Dechow et al. (2003, 359, Equation (2b))
yields similar results.

The Accounting Review January 2011
American Accounting Association

Voluntary Nonfinancial Disclosure and the Cost of Equity Capital 69

Effect of CSR Disclosure on the Future Cost of Equity Capital
Hypothesis 2 predicts that CSR disclosure leads to a lower cost of equity capital. We test H2 
by estimating the following regression model: 


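\[
\begin{aligned}
\Delta\%COC_{i,t+1} ={}& \beta_0 + \beta_1 DISCI_{i,t} + \beta_2 \Delta SIZE_{i,t} + \beta_3 \Delta BETA_{i,t} + \beta_4 \Delta LEV_{i,t} + \beta_5 \Delta MB_{i,t} \\
&+ \beta_6 \Delta LTG_{i,t} + \beta_7 \Delta LNDISP_{i,t} + \sum IND_i + \sum YEAR_t + \varepsilon_{i,t}
\end{aligned}
\tag{2}
\]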


where Δ%COC is the percentage change in the cost of equity capital from year t to year t+1.
The control variables also adopt the change form. A negative coefficient on DISCI would support
H2.
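
A small sketch of the change specification, assuming the percentage change is measured relative to the year-t value (the text does not state the base explicitly). The helper names and the sample numbers are hypothetical.

```python
def pct_change(value_t, value_t_plus_1):
    """Percentage change from year t to year t+1, relative to the year-t value."""
    return (value_t_plus_1 - value_t) / value_t * 100.0


def change(value_t, value_t_plus_1):
    """Simple first difference, used for controls that enter in change form."""
    return value_t_plus_1 - value_t


# Illustrative: a cost of equity falling from 12.0 to 11.4 percent
example_pct = pct_change(12.0, 11.4)
```

In this illustration the cost of equity capital declines by about 5 percent of its starting level, which is the kind of movement a negative coefficient on DISCI would capture.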

The control variables are derived from prior research. Fama and French (1992) find that 
expected returns are negatively associated with firm size and positively associated with the book- 
to-market ratio. Hence, we include firm size (SIZE) and the market-to-book ratio (MB). The 
market model BETA, which is estimated using CRSP daily data for each year, is included to 
control for systematic risk. Gebhardt et al. (2001) and Gode and Mohanram (2003) find that the 
implied cost of equity capital is positively associated with long-term growth rate. We therefore 
include an empirical proxy of long-term growth rate based on I/B/E/S analyst EPS forecasts 
(LTG), which is measured as the difference between the two-year-ahead consensus EPS forecast 
and the one-year-ahead consensus EPS forecast scaled by the one-year-ahead consensus EPS 
forecast. Gebhardt et al. (2001) and Dhaliwal et al. (2005) find that analyst forecast dispersion is 
negatively associated with the implied cost of equity capital. Thus, we include analyst forecast
dispersion (LNDISP), which is calculated as the logarithm of the standard deviation of analyst EPS
forecasts divided by the consensus forecast. We include leverage (LEV) because Fama and French
(1992) suggest that the cost of equity capital increases as the degree of leverage increases. All
other variables are as defined earlier.
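
The two analyst-based controls can be sketched as below. This is an illustration only: it assumes the consensus is the mean of the individual forecasts and that the standard deviation is the sample standard deviation, neither of which the text specifies, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def ltg(eps_one_year_ahead, eps_two_years_ahead):
    """Growth proxy: (two-year-ahead EPS - one-year-ahead EPS) / one-year-ahead EPS."""
    return (eps_two_years_ahead - eps_one_year_ahead) / eps_one_year_ahead


def lndisp(forecasts):
    """Log of forecast standard deviation scaled by the consensus forecast.

    Assumptions: consensus = mean forecast; sample standard deviation;
    at least two forecasts that are not all identical.
    """
    consensus = mean(forecasts)
    if consensus <= 0:
        raise ValueError("log dispersion is undefined for a non-positive consensus")
    return math.log(stdev(forecasts) / consensus)
```

How firm-years with a non-positive consensus are handled in the paper is not stated, so the sketch simply refuses to compute a value for them.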

Although firms may be motivated by a possible reduction in the cost of equity capital when 
deciding whether to issue a CSR report, from the perspective of investors, CSR disclosure per se 
may not necessarily warrant a lower cost of equity capital. Corporate managers could attempt to 
manage public impressions through such disclosures; therefore, CSR information can be self- 
serving and noncredible (Cormier and Magnan 2003; Hobson and Kachelmeier 2005). Investors 
are likely to have a favorable perception if a firm actually performs well in its CSR practices 
relative to its peers. To incorporate this possibility, we augment Equation (2) with a measure of a 
firm's relative CSR performance from KLD (HIPERFORM): 


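\[
\begin{aligned}
\Delta\%COC_{i,t+1} ={}& \beta_0 + \beta_1 DISCI_{i,t} + \beta_2 HIPERFORM_{i,t} + \beta_3 DISCI_{i,t} * HIPERFORM_{i,t} \\
&+ \beta_4 \Delta SIZE_{i,t} + \beta_5 \Delta BETA_{i,t} + \beta_6 \Delta LEV_{i,t} + \beta_7 \Delta MB_{i,t} + \beta_8 \Delta LTG_{i,t} \\
&+ \beta_9 \Delta LNDISP_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{3}
\]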


where HIPERFORM is an indicator variable that equals 1 if a firm’s CSR performance score, 
PERFORM, is higher than its industry median (in other words, if the firm is a superior CSR 
performer in its industry), and 0 otherwise. All other variables are as defined earlier. We expect the 
effect of DISCI * HIPERFORM to be negative. In an additional test, instead of using the interac- 
tion term between DISCI and HIPERFORM, we estimate Equation (2) within the high and low 
partitions of CSR performance scores. Using partitioned subsamples sacrifices some power due to 
reduced sample size, but has the benefit of flexibility that allows the effects of other variables to
also vary based on high or low levels of CSR performance scores. 
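
To make the role of the interaction term concrete, the sketch below computes the implied effect of first-time disclosure under Equation (3), which is the coefficient on DISCI plus, for superior performers, the coefficient on the interaction. The function names are hypothetical and the coefficient values in the example are made up for illustration; they are not estimates from the paper.

```python
def hiperform(perform, industry_median):
    """Return 1 if the CSR performance score exceeds its industry median, else 0."""
    return 1 if perform > industry_median else 0


def disclosure_effect(beta_disci, beta_interaction, hiperform_flag):
    """Implied change in the dependent variable when DISCI switches from 0 to 1."""
    return beta_disci + beta_interaction * hiperform_flag


# Made-up coefficients, for illustration only
effect_high = disclosure_effect(0.5, -2.0, hiperform(3, 1))
effect_low = disclosure_effect(0.5, -2.0, hiperform(0, 1))
```

With these illustrative values, disclosure is associated with a lower cost of equity only for the superior performer, which is the pattern a negative interaction coefficient would produce.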

Endogeneity and self-selection could potentially affect our results. In our main analysis, we
use a lead-lag approach to tackle these issues. To further enhance inferences based on our lead-lag
approach, we adopt the Heckman and Hausman two-stage procedures and repeat our main analyses.
The Heckman two-stage procedure introduces the inverse Mills ratio into the second-stage
OLS regression to control for self-selection bias that is related to CSR disclosure. We obtain
qualitatively similar results using the Heckman two-stage procedure. 17 The Hausman test deals
with potential endogeneity in the data. We conduct the Hausman test and find that endogeneity
does not qualitatively affect our main results.
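
For readers unfamiliar with the inverse Mills ratio, the sketch below shows its textbook form, computed from a fitted first-stage probit index z. This is not the authors' implementation: the first-stage specification is not reported here, and the function names and the `selected` flag are assumptions made for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z, selected=True):
    """Textbook inverse Mills ratio for a fitted probit index z.

    selected=True  -> pdf(z) / cdf(z), for observations that disclose
    selected=False -> -pdf(z) / (1 - cdf(z)), for observations that do not
    """
    if selected:
        return normal_pdf(z) / normal_cdf(z)
    return -normal_pdf(z) / (1.0 - normal_cdf(z))
```

The resulting value would enter the second-stage regression as an additional regressor, so that its coefficient absorbs the part of the outcome that is correlated with the decision to disclose.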


IV. RESULTS 
Descriptive Statistics 

Table 2, Panel A provides descriptive statistics for the variables included in Equation (1) for 
the full sample and separately for initiators and non-initiators. The cost of equity capital before 
CSR disclosure is significantly higher (p = 0.04) among CSR initiators (12.86 percent) than 
among non-initiators (11.98 percent). This difference is also reflected in a significantly positive
correlation coefficient between DISCI and COC_AVG in Table 2, Panel B, providing initial sup- 
port for H1. 

Consistent with the theory on voluntary disclosure, firms voluntarily publishing standalone 
CSR reports tend to have superior CSR performance (PERFORM) relative to their industry peers. 
The difference in CSR performance between the two groups (1.613 for initiators versus -0.166
for non-initiators) is significant (p < 0.01). The correlation between DISCI and PERFORM is also
significantly positive though at a relatively moderate level of 0.09 based on the Spearman corre- 
lation and 0.13 based on the Pearson correlation (see Table 2, Panel B). This highlights the 
importance of including PERFORM in our regression equations. 

Initiators are significantly larger (SIZE: 9.147 for initiators versus 5.783 for non-initiators, p
< 0.01) and more profitable (ROA: 0.051 for initiators versus 0.015 for non-initiators, p < 0.01)
than non-initiators, lending support to the financial resources argument for CSR disclosure.
Contrary to the proprietary information argument, initiators tend to face greater industry
competition than non-initiators (COMPETITION: -0.060 for initiators versus -0.069 for non-initiators,
p = 0.02).

Initiators have a significantly lower level of financing than non-initiators (FIN: -0.019 for
initiators versus 0.043 for non-initiators, p < 0.01). The negative financing level for initiators
implies that these firms, in net effect, either have repurchased stock or redeemed their debts. Firms
normally conduct repurchases when they believe that their stock is undervalued, indicating a high
cost of equity capital, which in turn provides an incentive for managers to increase disclosure and
transparency levels. Similarly, the redemption of mature debts likely implies that firms need future
financing to maintain a normal capital level. These firms would also be willing to increase their
level of disclosure if doing so helped them to lower the cost of borrowing.

Initiators have a higher degree of leverage than non-initiators (LEV: 0.265 for initiators versus
0.221 for non-initiators, p < 0.01). Those with a higher level of global operations are also more
likely to publish CSR reports (GLOBAL: 0.460 for initiators versus 0.219 for non-initiators, p <
0.01), consistent with the notion that these firms attract more attention in the international
community. Contrary to the notion of disclosing information to improve liquidity, initiators actually
have higher liquidity levels than non-initiators (LIQUIDITY: 1.387 for initiators versus 1.247 for
non-initiators, p = 0.04). Finally, initiators have better financial disclosure as manifested in their
more frequent management forecasts (CIG: 0.646 for initiators versus 0.527 for non-initiators,
p < 0.01) and better earnings quality (ABS_EM: 0.032 for initiators versus 0.066 for non-initiators,
p < 0.01) than non-initiators.

17 The only exception is for the test of analyst forecast errors. Among better CSR performers, the coefficient on DISCI is
positive and insignificant.


Cost of Equity Capital and the Likelihood of CSR Disclosure 


Hypothesis 1 predicts that a firm's likelihood of disclosing its corporate social responsibility 
activities is positively associated with its cost of equity capital in the previous year. We report the 
regression results for Equation (1) in Table 3. In Column I, we include all first-time reporting 
firm-year observations. In Column II, we exclude first-time reports that primarily discuss
environmental issues, following Simnett et al. (2009). In Column III, we examine the robustness of our
results to the exclusion of the Utilities industry. 

Across all three specifications of the dependent variable, the cost of equity capital, COC_AVG,
in year t-1 is significantly positively associated with a firm's likelihood of voluntarily
issuing a standalone CSR report in year t (coeff. = 0.049, p < 0.01; coeff. = 0.052, p < 0.01; and
coeff. = 0.062, p < 0.01 in Columns I, II, and III, respectively), consistent with H1, which posits
that a higher past cost of equity capital is associated with a greater likelihood of voluntary CSR
disclosure in the current year. In Column I, for instance, holding other factors constant, when the
prior year cost of equity capital increases by one percentage point, the odds of initiating standalone
CSR disclosure increase by 5.02 percent.
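
The 5.02 percent figure follows from exponentiating the logit coefficient. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the coefficients are those reported above for Table 3.

```python
import math


def odds_change_pct(coefficient, delta=1.0):
    """Percentage change in the odds for a `delta`-unit increase in a regressor
    in a logistic regression: (exp(coefficient * delta) - 1) * 100."""
    return (math.exp(coefficient * delta) - 1.0) * 100.0


column_i = odds_change_pct(0.049)
column_ii = odds_change_pct(0.052)
column_iii = odds_change_pct(0.062)
```

Applying the same transformation to the Column II and Column III coefficients gives increases in the odds of roughly 5.3 percent and 6.4 percent, respectively.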

The coefficient estimates of the control variables are generally consistent with the univariate 
comparisons in Table 2. One exception is LIQUIDITY, which reverses direction. The significantly 
negative coefficient suggests that firms with lower levels of liquidity are more likely to publish
CSR reports, consistent with our original conjecture. The effects of financial disclosure quality, 
ABS_EM, and management forecast, CIG, are no longer significant. 


CSR Disclosure and the Future Cost of Equity Capital 

Hypothesis 2 predicts that voluntary CSR disclosure leads to a lower future cost of equity
capital. Table 4, Panel A compares initiators and non-initiators and Table 4, Panel B presents the
regression results. In Column I (Equation (2)), the coefficient on DISCI is insignificant (coeff. =
-0.037, p > 0.50). It appears that CSR disclosure per se is not significantly associated with a
change in a firm's future cost of equity capital. In Column II (Equation (3)), we consider whether
a firm has superior CSR performance relative to its industry peers. The interaction term between
DISCI and HIPERFORM is significantly negative (coeff. = -4.618, p < 0.01), consistent with
H2, which posits that CSR disclosure reduces the cost of equity capital. 18 Combining the main
effect of DISCI and the effect of the interaction term between DISCI and HIPERFORM in Column
II, we infer that superior CSR performers enjoy a 1.833 percent reduction in the cost of equity
capital when they produce standalone CSR reports for the first time. In Columns III and IV, we
obtain similar results when we exclude environmental reports and the Utilities industry,
respectively. Overall, the evidence is consistent with our H2 that CSR-disclosing firms with superior
CSR performance achieve a reduction in the cost of equity capital. 19

18 We perform a sensitivity test by restricting the analysis to firm-year observations of CSR reporters in a pre-post setting.
Specifically, we focus only on disclosing firms and still use the change specification of the dependent variable
ΔCOC_AVG. DISCI is an indicator variable that equals 1 for the first reporting year and equals 0 before or after the first
reporting year of a disclosing firm. The purpose of this examination is to show that a reduction in the cost of equity
capital occurs immediately after the first reporting year and to alleviate the concern that the size of initiator sample is
small relative to the universe of firm-year observations used in the main test. We obtain similar results and inferences,
namely, firms with superior CSR performance enjoy a reduction in the cost of equity capital if they publish standalone
CSR reports.

Table 4, Panel C presents the results from estimating Equation (2) within the two subsamples 
partitioned based on annual industry medians of CSR performance (PERFORM). Consistent with 
the results in Panel B, we find a significantly negative coefficient on DISCI (coeff. = -1.777, p
= 0.04) in the high CSR performance subsample. This coefficient indicates that voluntary CSR
disclosure yields a 1.77 percent reduction in the cost of equity capital. In the low CSR
performance subsample, there is no significant association between CSR disclosure and the change in
the cost of equity capital. 20


Potential Mechanisms Linking CSR Disclosure and the Cost of Equity Capital 


The above results suggest that CSR disclosure combined with superior CSR performance is 
associated with a reduction in the cost of equity capital. Below, we provide evidence on the 
potential underlying mechanisms through which voluntary CSR disclosure lowers the cost of 
equity capital. We focus on two types of financial intermediaries: institutional investors and finan- 
cial analysts. 


CSR Disclosure and Institutional Investors 

Shleifer and Vishny (1986) suggest that the large equity stakes that institutional investors hold
in their invested firms and the high levels of sophistication of these investors enable them to
reduce agency cost problems and the extent of information asymmetry between managers and
shareholders, an effect that leads to a reduction in the cost of equity capital. We consider three
different types of institutional investors: dedicated (DED), transient (TRA), and quasi-indexer (QIX)
institutional investors. Dedicated institutional investors are more likely to play monitoring and
governance roles than the other two types (Bushee 1998). To determine whether CSR disclosure
attracts institutional investors, we follow Bushee and Noe (2000) and estimate the following model:


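\[
\begin{aligned}
\Delta INST_{i,t+1} ={}& \beta_0 + \beta_1 DISCI_{i,t} + \beta_2 HIPERFORM_{i,t} + \beta_3 DISCI_{i,t} * HIPERFORM_{i,t} + \beta_4 \Delta INST_{i,t-1} \\
&+ \beta_5 \Delta MRET_{i,t} + \beta_6 TVOL_{i,t-1} + \beta_7 \Delta MV_{i,t} + \beta_8 BETA_{i,t-1} + \beta_9 IRISK_{i,t-1} \\
&+ \beta_{10} \Delta LEV_{i,t} + \beta_{11} \Delta DP_{i,t} + \beta_{12} \Delta EP_{i,t} + \beta_{13} \Delta MB_{i,t} + \beta_{14} \Delta SGR_{i,t} + \beta_{15} \Delta RATE_{i,t} \\
&+ \beta_{16} \Delta SHRS_{i,t} + \sum IND_i + \sum YEAR_t + \varepsilon_{i,t}
\end{aligned}
\tag{4}
\]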


19 In alternative specifications of the model, we examine the effects of two other variables proxying for firms’ effort and
commitment to better CSR disclosure. (1) We identify firms that provide assurance (ASSURANCE) of their reports
through independent third parties, most often Big 4 accounting firms and international consulting companies. Simnett et
al. (2009) provide evidence that firms seeking to enhance the credibility of their reports and their corporate reputation
are more likely to have their sustainability reports assured. (2) We also assess the effect of the length of each CSR report
(LENGTH) relative to the average report length of the disclosing firm's industry (Leuz and Schrand 2008). Of course,
ASSURANCE and LENGTH are not independent of DISCI. We find that, conditional on first-time CSR disclosure
(DISCI), external assurance and long report length further reduce the cost of equity capital. Specifically, when we use
the ASSURANCE indicator (equals 1 with an assurance, and 0 otherwise), the coefficient on DISCI * HIPERFORM is
negative and significant (coeff. = -3.523, p < 0.01) and the coefficient on DISCI * HIPERFORM * ASSURANCE is
negative and significant (coeff. = -3.540, p = 0.08). Therefore, assurance doubles the effect of CSR disclosure. When
we use the LENGTH indicator (equals 1 if longer than the industry-year median, and 0 otherwise), the coefficient on
DISCI * HIPERFORM is negative and significant (coeff. = -2.574, p = 0.02) and the coefficient on DISCI *
HIPERFORM * LENGTH is negative and significant (coeff. = -3.930, p = 0.06). Therefore, a long report more than
doubles the effect of CSR disclosure.

20 The coefficient on DISCI is not significant but positive, if anything, for poor CSR performers. It is possible that
disclosing poor CSR performance could actually be a signal of high risk or firm weakness and, therefore, the cost of
equity capital could actually go up. This does explain why the direct effect of DISCI is insignificant in the pooled
regression.
where Δ denotes a change from year t to year t+1. INST represents stock ownership by dedicated
(DED), transient (TRA), or quasi-indexer (QIX) institutional investors. MRET is the
market-adjusted buy-and-hold stock return measured over the year. TVOL, a liquidity proxy, is the average
monthly trading volume relative to total shares outstanding. IRISK is the logarithmic
transformation of the standard deviation of market-model residuals calculated using daily stock returns. Beta
(BETA), debt ratio (LEV), and IRISK capture firm risk along different dimensions. DP is the ratio
of dividends to the market value of equity. EP is the ratio of income before extraordinary items to
the market value of equity. SGR is the percentage change in annual sales. We include DP, EP, MB,
and SGR to control for changes in firms' fundamentals that can affect the investment decisions of
institutional investors (Bushee 2001). RATE is the S&P stock rating (9 = A+, 8 = A, 7 = A-, 6
= B+, 5 = B, 4 = B-, 3 = C, 2 = D, 1 = not rated), which captures the preference of
institutional investors for well-reputed firms (Del Guercio 1996). SHRS is the logarithmic
transformation of shares outstanding, and its change form proxies for equity issuance or repurchases
that affect both institutional investor following and firms’ disclosure policies. All other variables
are as defined earlier.
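
The RATE coding described above can be written as a simple lookup. This is a sketch with hypothetical names; it treats any rating string outside the listed grades, including a missing value, as "not rated".

```python
# Numeric coding of S&P stock ratings as listed in the text
SP_RATING_SCORE = {
    "A+": 9, "A": 8, "A-": 7,
    "B+": 6, "B": 5, "B-": 4,
    "C": 3, "D": 2,
}


def rate(sp_rating):
    """Map an S&P stock rating to its numeric score; 1 means not rated."""
    return SP_RATING_SCORE.get(sp_rating, 1)


def delta_rate(rating_t, rating_t_plus_1):
    """Change in the numeric rating score from year t to year t+1."""
    return rate(rating_t_plus_1) - rate(rating_t)
```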

Table 5, Panel A presents comparisons of one-year-ahead holdings and changes in holdings by
the three types of institutional investors between initiators and non-initiators. Overall, the
univariate comparisons do not reveal significant differences between initiators and non-initiators. If
anything, we observe a greater decrease in transient institutional holding among initiators
compared to non-initiators (p = 0.04), even though the level of this type of holding is still slightly
higher among initiators than among non-initiators (p = 0.07).

Table 5, Panel B displays the regression results. There is weak evidence that initiating firms
with superior CSR performance attract more dedicated institutional investors. The coefficient on
DISCI * HIPERFORM is marginally significantly positive (coeff. = 0.414, p = 0.16). To further
examine this issue, we run regressions without the interaction term in the two subsamples
partitioned based on annual industry medians of CSR performance for dedicated institutional investors.
We report the results in Table 5, Panel C. We observe a significantly positive coefficient on DISCI
(coeff. = 0.438, p = 0.01) for the superior-performance group, whereas the coefficient on DISCI
for the low-performance group is insignificant. In untabulated tests, we do not find a significant
association between transient or quasi-indexer institutional investor holdings and the initiation of
CSR disclosure for the full sample or the partitioned subsamples.

In sum, the evidence in this subsection suggests that voluntary CSR disclosure attracts
dedicated institutional investors, who have long investment horizons and play monitoring and
governance roles. Consistent with our previous evidence that superior CSR performers enjoy a reduction
in the cost of equity capital through CSR disclosure, the effect of CSR disclosure on dedicated
institutional ownership is stronger if disclosing firms have CSR performance superior to their
industry peers.


CSR Disclosure and Analyst Forecasts 

We also examine three questions related to financial analysts and CSR disclosure. First, we
explore whether financial analysts are more willing to cover firms after they initiate CSR
disclosure. Second, we investigate whether the level of forecast accuracy increases. Third, we
determine whether forecast dispersion decreases when CSR reports are available. Increased levels
of analyst coverage and forecast accuracy and a reduction in the level of forecast dispersion have
the potential to lower the cost of equity capital. To determine the impact of CSR disclosure on the
behavior of financial analysts, we run the following three regressions following Lang and
Lundholm (1996) and Ali et al. (2007):
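$$
\begin{aligned}
\Delta \mathit{COVERAGE}_{i,t+1} = {} & \beta_0 + \beta_1 \mathit{DISCI}_{i,t} + \beta_2 \mathit{HIPERFORM}_{i,t} + \beta_3 \mathit{DISCI}_{i,t} * \mathit{HIPERFORM}_{i,t} \\
& + \beta_4 \Delta \mathit{SIZE}_{i,t} + \beta_5 \Delta \mathit{STDROE}_{i,t} + \beta_6 \Delta \mathit{INVPRICE}_{i,t} + \beta_7 \Delta \mathit{RETVAR}_{i,t} \\
& + \beta_8 \Delta \mathit{RD}_{i,t} + \beta_9 \Delta \mathit{ROA}_{i,t} + \beta_{10} \Delta \mathit{CORR}_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{5}
$$

$$
\begin{aligned}
\Delta |\mathit{FE}|_{i,t+1} = {} & \beta_0 + \beta_1 \mathit{DISCI}_{i,t} + \beta_2 \mathit{HIPERFORM}_{i,t} + \beta_3 \mathit{DISCI}_{i,t} * \mathit{HIPERFORM}_{i,t} + \beta_4 \Delta \mathit{SIZE}_{i,t} \\
& + \beta_5 \Delta \mathit{STDROE}_{i,t} + \beta_6 \Delta \mathit{CHEPS}_{i,t} + \beta_7 \Delta \mathit{RD}_{i,t} + \beta_8 \Delta \mathit{ROA}_{i,t} + \beta_9 \Delta \mathit{CORR}_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{6}
$$

$$
\begin{aligned}
\Delta \mathit{DISP}_{i,t+1} = {} & \beta_0 + \beta_1 \mathit{DISCI}_{i,t} + \beta_2 \mathit{HIPERFORM}_{i,t} + \beta_3 \mathit{DISCI}_{i,t} * \mathit{HIPERFORM}_{i,t} + \beta_4 \Delta \mathit{SIZE}_{i,t} \\
& + \beta_5 \Delta \mathit{STDROE}_{i,t} + \beta_6 \Delta \mathit{CHEPS}_{i,t} + \beta_7 \Delta \mathit{RD}_{i,t} + \beta_8 \Delta \mathit{ROA}_{i,t} + \beta_9 \Delta \mathit{CORR}_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{7}
$$
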


where COVERAGE is the 12-month average of the number of analysts who issue annual earnings 
forecasts captured in the I/B/E/S database for a specific firm; |FE| is the absolute value of the 
12-month average of analyst forecast errors, which is defined as actual earnings minus the mean 
forecast, deflated by the stock price at the beginning of the fiscal year; and DISP is the 12-month 
average of the standard deviations of analyst forecasts, deflated by the stock price at the beginning 
of the fiscal year. 
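
The three analyst variables defined above can be illustrated with a minimal Python sketch. The function names, the input layout (one entry per month), and the use of the sample standard deviation are assumptions made for illustration; the text does not specify them, and the figures below are hypothetical rather than drawn from I/B/E/S.

```python
from statistics import mean, stdev


def coverage(monthly_analyst_counts):
    """COVERAGE: 12-month average of the number of analysts issuing annual forecasts."""
    return mean(monthly_analyst_counts)


def abs_forecast_error(actual_eps, monthly_mean_forecasts, begin_price):
    """|FE|: absolute value of the 12-month average price-deflated forecast error."""
    # Forecast error = actual earnings minus the mean forecast, deflated by price.
    errors = [(actual_eps - f) / begin_price for f in monthly_mean_forecasts]
    return abs(mean(errors))


def forecast_dispersion(monthly_forecasts, begin_price):
    """DISP: 12-month average of the price-deflated standard deviation of forecasts."""
    # Assumption: sample standard deviation across analysts within each month.
    return mean([stdev(fs) / begin_price for fs in monthly_forecasts])


# Hypothetical firm-year: actual EPS of 2.00 and a beginning-of-year price of 20.00.
print(coverage([10, 12] * 6))
print(abs_forecast_error(2.00, [1.90] * 12, 20.00))
print(forecast_dispersion([[1.80, 2.00, 2.20]] * 12, 20.00))
```

The regressions in Equations (5) through (7) use the year-over-year change in each of these quantities, which would be the difference between two such firm-year values.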

We include a number of control variables derived from prior research. We include firm size 
(SIZE) because larger firms have more potential brokerage or investment banking businesses for 
analysts’ brokerage houses (Bhushan 1989), which affects analyst forecasting behavior. We in- 
clude the inverse of stock prices (INVPRICE) because Brennan and Hughes (1991) suggest that it 
proxies for the brokerage commission rate. Analysts are more likely to follow firms with higher 
levels of return variability because the anticipated trading benefits based on private information on 
these stocks are greater (Bhushan 1989). We therefore include STDROE, which is measured as the 
standard deviation of ROE in the preceding four quarters, and RETVAR, computed as the daily 
stock return variance over the 200 days prior to the year-end. We include research and
development expense (RD) as a proxy for the level of information asymmetry (Aboody and Lev 2000) 
because analysts have relatively stronger incentives to follow firms with higher levels of informa- 
tion asymmetry (Barth et al. 2001). The earnings-return (Pearson) correlation (CORR) between 
ROE and annual stock returns in the preceding four quarters captures the difficulty in predicting a 
firm's earnings. In addition, ROA controls for firm profitability. Finally, annual change in EPS 
(ΔCHEPS) controls for the magnitude of the forthcoming earnings information (Ali et al. 2007). 
All other variables are as defined earlier. 

Table 6, Panel A presents a comparison of the levels of and changes in the three main analyst 
variables in the year following first-time CSR disclosures. Initiators are covered by more analysts 
than non-initiators (COVERAGE: 26.08 for initiators versus 15.72 for non-initiators, p < 0.01), 
and achieve greater improvement in forecast accuracy than non-initiators (Δ|FE|: −0.137 for 
initiators versus 0.120 for non-initiators, though at a more marginal statistical significance level 
with p = 0.07). 

We present the multivariate regression results for Equations (5), (6), and (7) in Panels B, C,
and D of Table 6, respectively. Column I of Panel B shows that there is a
significantly positive coefficient on DISCI * HIPERFORM (coeff. = 1.052, p = 0.05), which
suggests that analyst following increases for initiators with superior CSR performance. When we
run the regression separately in the two subsamples partitioned based on industry medians of CSR
performance, we find a significantly positive coefficient on DISCI (coeff. = 0.904, p = 0.03) but
only in the better performing group.²¹

We obtain similar results for the absolute magnitude of analyst forecast errors. Table 6, Panel
C demonstrates that, while the coefficient on DISCI * HIPERFORM is insignificant in the full
sample, the main effect of DISCI in the partitioned sample regression is marginally negative
(coeff. = −0.251, p = 0.08) for the better performing group. The results for forecast dispersion,
presented in Table 6, Panel D, show a similar pattern. In the full-sample regression in Column I,
the coefficient on DISCI * HIPERFORM is significantly negative (coeff. = −0.053, p = 0.04).
The regressions in the partitioned sample (Columns II and III) yield a significantly negative
coefficient on DISCI only among better performing firms (coeff. = −0.048, p < 0.01).

In sum, voluntary CSR disclosure is associated with increased analyst coverage, improved 
forecast accuracy, and a reduction in forecast dispersion among firms with relatively superior CSR 
performance. These results are consistent with our conjecture that CSR disclosure by strong CSR 
performers helps reduce information asymmetry between managers and shareholders and among
shareholders. The evidence supports our reasoning that CSR disclosure can reduce the cost of
equity capital by reducing estimation risk in the market. In other words, consistent with our earlier
evidence that the cost of equity capital benefit manifests only among firms with relatively superior
CSR performance, the effects of CSR disclosure on analyst coverage and forecast accuracy and
dispersion are significant only when firms achieve relatively superior CSR performance.


CSR Disclosure and Subsequent Equity Issuances 

As discussed previously, we predict that firms anticipating external financing needs are more 
likely to initiate CSR disclosures in the hope of obtaining cheaper capital. Hence, we should 
observe more equity issuances after first-time CSR disclosures. We estimate the following logistic 
regression for Equation (8) and OLS regression for Equation (9) to empirically test this prediction: 


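$$
\begin{aligned}
\log\left[\frac{\mathrm{prob}(\mathit{SEO}_{i,t+T})}{1 - \mathrm{prob}(\mathit{SEO}_{i,t+T})}\right] = {} & \beta_0 + \beta_1 \mathit{DISCI}_{i,t} + \beta_2 \mathit{MB}_{i,t} + \beta_3 \mathit{LNSALES}_{i,t} \\
& + \beta_4 \mathit{ROA}_{i,t} + \beta_5 \mathit{LEV}_{i,t} + \beta_6 \mathit{CASH}_{i,t} + \beta_7 \mathit{FIN}_{i,t} \\
& + \beta_8 \mathit{PAYOUT}_{i,t} + \beta_9 \mathit{CAPITAL}_{i,t} + \beta_{10} \mathit{RD}_{i,t} \\
& + \beta_{11} \mathit{LNDISP}_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{8}
$$

$$
\begin{aligned}
\mathit{ISSUE\$}_{i,t+T} = {} & \beta_0 + \beta_1 \mathit{DISCI}_{i,t} + \beta_2 \mathit{MB}_{i,t} + \beta_3 \mathit{LNSALES}_{i,t} + \beta_4 \mathit{ROA}_{i,t} + \beta_5 \mathit{LEV}_{i,t} + \beta_6 \mathit{CASH}_{i,t} \\
& + \beta_7 \mathit{FIN}_{i,t} + \beta_8 \mathit{PAYOUT}_{i,t} + \beta_9 \mathit{CAPITAL}_{i,t} + \beta_{10} \mathit{RD}_{i,t} + \beta_{11} \mathit{LNDISP}_{i,t} + \varepsilon_{i,t}
\end{aligned}
\tag{9}
$$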


where T (= 1 or 2) denotes one or two years following CSR disclosure, and other notations follow
those of earlier regression equations. SEO_{t+1} (SEO_{t+2}) equals 1 if a firm conducts a seasoned
equity offering within one (two) year(s) following CSR disclosure. ISSUE$_{t+1} (ISSUE$_{t+2}) is the
total dollar amount in billions raised in SEOs within one (two) year(s) following CSR disclosure.
We obtain information on SEOs from the Security Data Corporation (SDC).

Following prior studies, we control for other potential factors affecting the equity issuance 
decisions of firms. We include the market-to-book ratio (MB), as Stein (1995) suggests that firms 
will choose the time when their stock is overvalued to issue equity. In addition, growth firms have 


21 A careful examination of the distribution of ΔCOVERAGE reveals that there are some relatively large values on the
negative tail. To determine whether our results are sensitive to these large values, we exclude the bottom 5 percent of
ΔCOVERAGE and get a more symmetric distribution on this variable. We also try excluding extreme values of the upper
and the bottom 2 percent of this variable. Overall, our main results are not sensitive to the treatment of the large negative
values of ΔCOVERAGE.
a greater need for capital. We include research and development expenses (RD) and capital
expenditures (CAPITAL) as additional proxies for growth opportunities. The likelihood of an equity 
issuance can depend on the extent of the financial constraints that a firm faces. Also, firms may 
follow a specific pecking order in their choice of financing options and rely preferentially on 
internal reserves and debt financing before issuing equity (Myers and Majluf 1984). Hence, we 
include profitability (ROA), cash flow (CASH), payout ratio (PAYOUT: cash dividend) and lever- 
age (LEV) to capture financial constraints, internally generated funds, and debt capacity. We 
control for analyst forecast dispersion (LNDISP: the logarithm of the standard deviation of analyst 
forecasts divided by the consensus forecast) as a proxy for the degree of agreement between 
management and investors because Dittmar and Thakor (2007) argue that firms are more likely to 
issue equity when the level of agreement is high. Finally, we control for firm size (LNSALES: 
logarithm of total sales) and financing activities (FIN) already conducted in the current year. 

Equation (8) assesses whether CSR disclosure is related to the likelihood of future equity
issuance through SEOs, and Equation (9) examines whether CSR disclosure is associated with the
size of SEOs. Table 7, Panel A indicates that firms are more likely to seek equity capital through
SEOs in the two years following CSR disclosure. The coefficients on DISCI are positive and
marginally significant (coeff. = 0.504, p = 0.09 for SEO_{t+1}; coeff. = 0.625, p = 0.06 for SEO_{t+2}).
Based on the coefficient estimate for DISCI in Column I (II), in the first year (two years) after
CSR disclosure, the odds that initiators will issue equity are 65.5 percent (86.8 percent) higher than
those of non-initiators.
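
The percentages quoted above follow from exponentiating the logit coefficients, since a coefficient on a 0/1 regressor is the log of the odds ratio. The short Python sketch below reproduces that arithmetic; the function name is ours and is purely illustrative.

```python
import math


def odds_increase_pct(coef):
    """Percentage change in the odds implied by a logit coefficient on a 0/1 regressor."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients on DISCI reported in Table 7, Panel A.
for label, coef in (("SEO_{t+1}", 0.504), ("SEO_{t+2}", 0.625)):
    print(f"{label}: odds higher by {odds_increase_pct(coef):.1f} percent")
```

Note that these are changes in the odds, not in the probability, of issuing equity.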

Table 7, Panel B reveals that, holding other factors constant, disclosing firms not only are 
more likely to issue equity, but also raise a significantly larger amount than non-initiators. The 
difference ranges from U.S. $165 million (ISSUE$_{t+2}) to U.S. $173 million (ISSUE$_{t+1}). These
results are consistent with managers initiating CSR disclosures before going to the capital market
in anticipation of obtaining cheaper external capital and increasing their capacity to raise external
capital.


Additional Analyses 


Alternative Measures of CSR Disclosure 

Our CSR disclosure measure, DISCI, which captures first-time reporters, best serves the 
purpose of testing our hypotheses. However, for the sake of completeness, we also test our 
hypotheses using alternative measures of CSR disclosure. The first measure, DISC_{i,t}, is an
indicator variable that equals 1 if Firm i discloses a standalone CSR report in Year t, and 0 otherwise.
The second measure, DISCN_{i,t}, indicates whether a firm only sporadically publishes CSR reports.
DISCN_{i,t} takes a value of 1 if Firm i issues a CSR report in year t, but not in year t+1 (even though
it resumes reporting in year t+2 or later), and 0 otherwise. Allowing for non-first-time disclosing
years significantly increases our sample size. We obtain results qualitatively similar to those based
on DISCI. Finally, after excluding first-time reports, we use only continuing reports, DISC_{i,t},
which can be just mundane duplications of earlier reports containing less incremental information. 
We find a much weaker, albeit still significant, result. 
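
As an illustration of how these indicators relate to one another, the following Python sketch derives them from the list of years in which a firm issued a standalone report. The function name and input format are hypothetical, and the DISCN rule reflects one reading of the definition above (a report in year t, none in year t+1, and at least one report in year t+2 or later).

```python
def disclosure_indicators(report_years, year):
    """Return (DISC, DISCI, DISCN) for one firm-year.

    report_years: years in which the firm issued a standalone CSR report.
    """
    years = set(report_years)
    disc = year in years
    # DISCI: the firm's first-ever standalone report falls in this year.
    disci = disc and year == min(years)
    # DISCN (assumed reading): report in t, none in t+1, reporting resumes in t+2 or later.
    resumes = any(y >= year + 2 for y in years)
    discn = disc and (year + 1) not in years and resumes
    return int(disc), int(disci), int(discn)


# Hypothetical firm reporting in 2003, 2005, and 2006.
for y in (2003, 2004, 2005, 2006):
    print(y, disclosure_indicators([2003, 2005, 2006], y))
```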


Individual Measures of the Cost of Equity Capital 


In the above analyses, we use the average of the three cost of equity capital measures based 
on Gebhardt et al. (2001), Claus and Thomas (2001), and Easton (2004). These estimation meth- 
ods are based on different earnings growth assumptions and therefore have distinct strengths and 
weaknesses. The merit of each measure is debated among researchers, and it is not our intention 
to resolve this debate. Averaging across the three estimates potentially reduces noise in individual 
measures (Larcker and Rusticus 2010), and is widely used in the literature (Hail and Leuz 2006; 
Dhaliwal et al. 2006). To provide assurance that our results are not sensitive to the choice of these 
measures, we repeat our analyses using the three measures separately and obtain similar results. 


Correlation between KLD Performance and CSR Disclosure

KLD performance scores (PERFORM) are correlated with the decision of firms to issue CSR 
reports for the first time (DISCI), but their correlation coefficients (Table 2, Panel B) are moderate 
at about 10 percent.²² As discussed earlier, we control for CSR performance in all regression
equations. To further alleviate the concern regarding the correlation between the KLD performance
scores and CSR disclosure of firms and the potential impact of this correlation on our
results, we conduct the following robustness analyses. First, we remove the transparency-related
category, that is, "Corporate Governance," from the KLD performance ranking scores, as this
category contains a subcategory, "Transparency," which is a dimension that is likely to reflect the
CSR disclosure policy of firms. Our main inferences are unchanged. Second, we use the performance
score of each KLD CSR category to measure firms' social performance. Our main inferences
are unchanged. Finally, we match each DISCI observation with a non-disclosing firm that
has the closest industry-adjusted KLD CSR performance score in the same year and industry and
run regression Equation (2). The coefficient on DISCI is significantly negative at the conventional 
level, suggesting that CSR initiators enjoy a subsequent reduction in the cost of equity capital. 
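
The matching step in the last robustness test can be sketched as a nearest-neighbor search within year and industry. The Python below is a minimal illustration, not the procedure actually used: the tuple layout, the function name, and matching with replacement (a control firm may serve more than one initiator) are all assumptions, and the data are hypothetical.

```python
def match_controls(initiators, candidates):
    """Pair each initiator with the closest non-disclosing firm in the same year and industry.

    Each firm is a tuple: (firm_id, year, industry, industry_adjusted_kld_score).
    Matching is with replacement (an assumption).
    """
    pairs = {}
    for firm, year, industry, score in initiators:
        pool = [c for c in candidates if c[1] == year and c[2] == industry]
        if not pool:
            continue  # no eligible control firm; leave this initiator unmatched
        best = min(pool, key=lambda c: abs(c[3] - score))
        pairs[firm] = best[0]
    return pairs


# Hypothetical firms for illustration only.
initiators = [("A", 2005, "Chem", 0.40), ("B", 2005, "Tech", -0.10)]
candidates = [
    ("X", 2005, "Chem", 0.10),
    ("Y", 2005, "Chem", 0.35),
    ("Z", 2005, "Tech", 0.00),
    ("W", 2004, "Tech", -0.10),  # same score as B but a different year, so ineligible
]
print(match_controls(initiators, candidates))
```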


Alternative Measures of CSR Performance 


To determine if our results are sensitive to alternative measures of CSR performance, we use
two additional measures. One measure is an indicator DJSI that equals 1 if a firm appeared in the
Dow Jones Sustainability Index in any year during the period 2002–2007, and 0 otherwise. The
other is an indicator CRO that equals 1 if a firm was on the "100 Best Corporate Citizens" list for
2007 from the Corporate Responsibility Officer, and 0 otherwise.²³ We do not consider year-to-year
variation because of data constraints. It turns out that DJSI (CRO) is correlated with our KLD
performance scores with a Pearson coefficient of 30 percent (23 percent). Using these two measures
in place of the KLD scores produces qualitatively similar results.


V. SUMMARY AND CONCLUSIONS

We examine a potential benefit associated with the initiation of voluntary disclosure of CSR
activities: a reduction in the cost of equity capital. We find that the likelihood of a firm initiating
standalone disclosure of CSR activities is associated with a higher prior year cost of equity capital.
Firms with CSR performance superior to that of their industry peers enjoy a reduction in the cost
of equity capital after they initiate CSR reports. Further, firms initiating CSR disclosure with
superior CSR performance attract dedicated institutional investors and analyst coverage, and these
analysts achieve lower absolute forecast errors and dispersion following such disclosure. Finally,
CSR disclosure initiators appear to exploit this potential benefit of a reduction in the cost of equity


22 Consistent with this low correlation, KLD provides the following information regarding how it rates the CSR performance
of firms (see http://www.kld.com/research/methodology.html): "KLD researches the social, environmental, and
governance performance of corporations. KLD research relies on five distinct data sources to inform our ratings and
analysis. Data are collected in a disciplined process from a wide variety of company, government, and non-government
organization and media sources. KLD tracks each company through more than 14,000 global media sources daily."
These five distinct data sources include (1) direct communication with company officers; (2) a network of global ESG 
research firms that cover non-U.S. markets; (3) review of more than 14,000 global news sources; (4) public documents 
of companies, including annual reports and proxy statements; and (5) information obtained from government and 
non-government organizations including the U.S. Department of Labor, EPA, Human Rights Watch, OSHA, CANICOR, 
Ceres, ICCR, and DoD. Hence, it appears that the CSR reports of firms constitute only one of the numerous information 
sources employed by KLD. 

23 See http://www.thecro.com/. 
capital. They are more likely than non-disclosing firms to conduct SEOs to raise capital in the two 
years following the disclosure. In addition, among firms conducting SEOs, CSR disclosure initia- 
tors raise a significantly larger amount of equity capital than non-initiators. 

This study adds to the voluntary disclosure literature by extending the traditional research on 
voluntary disclosure beyond the narrow focus of financial disclosure. Our analyses enhance our 
understanding of the rationales behind and the consequences of the recent trend in voluntary CSR 
disclosure. These results have important implications for companies, regulators, and investors. 

A few caveats are worth noting. Most of the control variables that we use in the CSR 
determination model are obtained from the standard voluntary disclosure literature. To the extent 
that CSR disclosure is distinct from other forms of voluntary disclosure examined in the literature, 
we may have missed important determinants of CSR disclosure. In addition, it is possible that we 
missed some reports on stale websites because of their lack of maintenance, which would add 
noise to our results. Also, we do not examine the content of the CSR reports. To the extent that the 
detailed information of these reports is not fully captured by the KLD scores, we fail to capture 
some important characteristics of CSR reports. Further, it is important to control for the other 
disclosure policies of firms when examining the impact of CSR disclosure. Our empirical proxies 
using management guidance and earnings quality may not be sufficient to capture these potential 
confounding effects. Finally, although the KLD rating is widely used in the management literature, 
a significant amount of future research is warranted to further establish its validity in measuring 
the social performance of firms. 

These caveats notwithstanding, we believe that our study opens various avenues for future 
research. For example, CSR disclosure and performance could have a different impact on the cost 
of debt as debtholders have a payoff function different from that of equityholders. Further, the 
effect of CSR disclosure could be a function of differences in legal environment and institutional 
setting. Therefore, an international study could help us better understand CSR disclosure. Last, as 
mentioned previously, it would be worthwhile to investigate the effect of the information content 
of CSR reports on the valuation decisions of investors. 
| Main Categories | Sub-Categories | Max. Strength (Perfect Score) | Actual Max. Strength | Actual Mean Strength |
|---|---|---|---|---|
| Community | (1) Charitable Giving, (2) Innovative Giving, (3) Non-U.S. Charitable Giving, (4) Support for Education, (5) Support for Housing, (6) Volunteer Programs, and (7) Other Strengths | 7 | 5 | 0.189 |
| Corporate Governance | (1) Compensation, (2) Ownership, (3) Political Accountability, (4) Transparency, and (5) Other Strengths | 5 | 3 | 0.167 |
| Diversity | (1) Board of Directors, (2) CEO, (3) Employment of the Disabled, (4) Promotion, (5) Women and Minority Contracting, (6) Work/Life Benefits, (7) Gay and Lesbian Policies, and (8) Other Strengths | 8 | 7 | 0.605 |
| Employee Relations | (1) Health and Safety, (2) Retirement Benefits, (3) Union Relations, (4) Cash Profit Sharing, (5) Employee Involvement, and (6) Other Strengths | 6 | 5 | 0.292 |
| Environment | (1) Beneficial Products and Services, (2) Clean Energy, (3) Pollution Prevention, (4) Recycling, and (5) Other Strengths | 5 | 4 | 0.140 |
| Human Rights | (1) Labor Rights, (2) Relations with Indigenous Peoples, and (3) Other Strengths | 3 | 2 | 0.004 |
| Product | (1) Benefits the Economically Disadvantaged, (2) Quality, (3) R&D/Innovation, and (4) Other Strengths | 4 | 3 | 0.077 |
| Total Strength | The sum of all of the above seven main categories. | 38 | 29 | 1.474 |
ABSTRACT: We investigate whether short sellers and analysts differ in their use of 
information that is predictive of future returns. We find that short interest is significantly 
associated in the expected direction with all 11 variables examined. In contrast, ana- 
lysts tend to positively recommend stocks with high growth, high accruals, and low 
book-to-market ratios, despite these variables having a negative association with future 
returns. We then investigate the profitability of using short interest in trading. We find 
abnormal returns (1.11 percent per month) from a zero-investment strategy that (1) 
shorts firms with highly favorable analyst recommendations (buy signal) but high short 
interest (sell signal), and (2) buys firms with highly unfavorable analyst recommenda- 
tions (sell signal) but low short interest (buy signal). Short interest, therefore, appears to 
capture predictive information that can be used by investors in trading against analysts' 
recommendations to increase returns. 


Keywords: short interest; analyst recommendations; fundamental analysis; arbitrage. 


Data Availability: Data are available from the sources identified in the study. 


I. INTRODUCTION 
Academic research provides extensive evidence that fundamental analysis can be used to
earn abnormal returns (e.g., Frankel and Lee 1998; Piotroski 2000; Swanson et al. 2003).
This evidence suggests there is a delay in the price discovery process, which has spawned 
research into the roles that financial analysts play in this delay (Bradshaw et al. 2001; Jegadeesh 
et al. 2004, hereafter JKKL; Bradshaw 2004; Barniv et al. 2009). Analysts' recommendations are
the end-product from an extensive analysis of information, and they affect market prices by stating
a specific course of action that an investor should take (Asquith et al. 2005a; Malmendier and
Shanthikumar 2007). JKKL (2004) provide evidence, however, that analysts' recommendations are
positively associated with some accounting, valuation, and growth characteristics that have a
negative association with future returns (e.g., low book-to-market, high sales growth, and high
accruals). Analysts' incentives to obtain investment banking business and to generate trading
commissions are potential explanations for why they tend to over-recommend these stocks (Lin
and McNichols 1998; Barber et al. 2007) and, indirectly, why analysts might contribute to a delay 
in price discovery. 

Our study investigates how short sellers use publicly available information. Short sellers are 
regarded as particularly sophisticated investors under financial economic theory.¹ Similar to
analysts, short sellers invest considerable time and resources in analyzing companies, but they face
potentially different incentives. Because short sellers place their own capital at risk, they have 
strong incentives to fully use predictive information. Research on short sellers’ use of fundamental 
information is limited, but Dechow et al. (2001) find that short sellers use fundamentals-to- 
valuation ratios to identify stocks that are expected to realize negative future returns. We extend 
Dechow et al. (2001) by investigating how short sellers utilize 11 items of fundamental information
identified by JKKL that are predictive of future returns. We also conduct the JKKL analysis
of analyst recommendations for our sample to allow a direct comparison of information use by
analysts and short sellers. Our investigation provides further evidence on the role of short sellers 
in capital markets and their ability to interpret publicly available information. While financial 
analysts are acknowledged for their information intermediary role, we investigate the possibility 
that short sellers can also serve as an information intermediary and thereby facilitate the price 
discovery process. 

Following the JKKL analysis of analysts’ information use, we rank firms into quintile port- 
folios based on the consensus analyst recommendation. In a corresponding manner, we rank firms 
into quintile portfolios by the level of short interest to reflect short seller beliefs about future 
returns. We then employ ordered logistic regression to examine how analysts and short sellers use 
the company-specific predictive information. We categorize the information into four general 
types: accounting, valuation, growth, and momentum. We find that short interest is strongly
associated with all 11 information items in the direction consistent with their empirical relation to
future returns. This result indicates that short sellers are highly informed about how company- 
specific information is likely to affect future returns. In contrast, we confirm results in JKKL that 
analysts tend to positively recommend stocks with high growth, high accruals, and low book-to- 
market ratios, despite these variables having a negative association with future returns. When we 
examine recommendation revisions, we find similar evidence. 

Our finding that short sellers interpret information properly as it relates to future returns 
supports an information intermediary role for short sellers. Consistent with this role, the SEC 
(1999, 3) observes that short sellers can "add to stock pricing efficiency because their transactions
inform the market of their evaluation of future stock price performance" (emphasis added).
Pownall and Simko (2005) provide evidence that short sellers can serve as information
intermediaries when analyst following for a firm is low. In this situation, there are limited alternative








1 Diamond and Verrecchia (1987) argue that only informed traders with strong beliefs that stock prices will fall in the
near-term will choose to sell stock short. Their reasoning is based on the notion that the high costs of short selling drive
out uninformed traders, so that open short positions reflect trades by more informed investors. Boehmer et al. (2008,
491) comment that short sellers “occupy an exalted place in the pantheon of investors as rational, informed market 
participants who act to keep prices in line." 
sources of guidance. Our tests of information use suggest that short sellers can fill a complementary
information intermediary role even when coverage by analysts is extensive.²

To provide further evidence on short sellers’ effectiveness as information intermediaries, we 
test whether short interest provides value-relevant information about future returns beyond that 
provided by analyst recommendations and the 11 predictive variables. Short positions could pro- 
vide incremental information because short sellers have sources of information not considered in 
our models or because they adjust the weights they place on items of information as market 
conditions change. In a regression model explaining future six-month returns adjusted for 
characteristic-based portfolio returns (Daniel et al. 1997), we find that the coefficient on short 
interest is negative (as predicted) and statistically significant after controlling for the information 
in analyst recommendations and the 11 predictive variables. These results indicate that short 
interest provides incremental information about future abnormal returns that is orthogonal to the 
information provided by analysts and the 11 predictive variables. In addition, the significant 
negative coefficient on the consensus analyst recommendation indicates that more favorable rec- 
ommendations result in lower future returns after controlling for open short interest and the 11 
predictive variables. If we consider recommendation revisions, then its coefficient is insignificant 
and the coefficient on short interest remains negative and significant. 

Our analyses and the evidence from JKKL indicate that analysts sometimes provide favorable 
(unfavorable) recommendations for stocks with characteristics that are associated with negative 
(positive) future equity returns. Since analyst recommendations influence trading decisions and 
stock prices (Asquith et al. 2005a; Malmendier and Shanthikumar 2007), their recommendations 
can provide support for stock prices that have temporarily deviated from their fundamental values. 
We investigate if short interest can be used to identify such stocks. That is, we examine if investors 
can use short interest together with analyst recommendations to construct a portfolio that is likely 
to earn abnormal future returns. To our knowledge, ours is the first study to link these two 
investment signals—both from highly regarded capital market participants—to forecast future 
long-run returns. 

We use a large sample of monthly observations over the period 1994 to 2006 to test alterna- 
tive trading strategies. We first construct quintile portfolios based on the consensus analyst rec- 
ommendation and then examine abnormal returns from a trading strategy that invests long (short) 
in firms comprising the best (worst) recommendation quintile. We find that the hedge return is 
modestly negative at —26 basis points per month for our test period. This result is perhaps 
surprising given the high esteem placed on financial analysts within the financial community, but 
it is consistent with our results on information use and the growing literature that questions the 
investment value of analyst recommendations (Demirakos et al. 2004; Bradshaw 2004; Barniv et 
al. 2009). We employ the same sorting procedure to form investment portfolios based on levels of 
short interest, and we obtain a statistically significant hedge return of 56 basis points per month 
from selling short firms in the highest short interest quintile and buying firms in the lowest 
quintile. Interestingly, the monthly returns from buying stocks with low short interest (29 basis 
points) are similar to the returns from selling short stocks with high short interest (27 basis points). 


2 Note that analysts also provide earnings forecasts, price targets, and narrative discussion that can be informative to 
investors—possibly more informative than their recommendations. 
Concurrent research by Boehmer et al. (2010) also finds that significant abnormal returns can be 
earned by buying stocks with low short interest. 
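
As an illustration only, the single-signal sort described above can be sketched in Python for one month of data. The function names, the equal weighting within quintiles, the tie-breaking rule, and the toy numbers are all assumptions made for this sketch; they are not taken from the study.

```python
from statistics import mean


def assign_quintiles(values):
    """Rank values into quintiles: 1 = lowest fifth, 5 = highest fifth.

    Ties are broken by input order, a simplification for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    quintiles = [0] * len(values)
    for rank, idx in enumerate(order):
        quintiles[idx] = rank * 5 // len(values) + 1
    return quintiles


def hedge_return(signal, abnormal_returns, long_q, short_q):
    """Equal-weighted return of quintile long_q minus that of quintile short_q."""
    q = assign_quintiles(signal)
    long_leg = mean(r for r, k in zip(abnormal_returns, q) if k == long_q)
    short_leg = mean(r for r, k in zip(abnormal_returns, q) if k == short_q)
    return long_leg - short_leg


# Hypothetical month: short interest ratios and next-month abnormal returns.
si_ratio = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
ab_ret = [0.004, 0.002, 0.001, 0.000, 0.001, -0.001, 0.000, -0.002, -0.003, -0.003]

# Buy the lowest short interest quintile, sell short the highest.
spread = hedge_return(si_ratio, ab_ret, long_q=1, short_q=5)
print(round(spread * 10000), "basis points")
```

For a recommendation-based strategy the same function would be called with the consensus recommendation as the signal and the long and short quintiles swapped (long the most favorable quintile, short the least favorable).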

We next examine whether abnormal returns can be improved by using information from both 
analysts and short sellers. Here, we intersect the analyst recommendation and short interest quin- 
tiles to produce 25 portfolios (5 X 5) formed using information from both signals. We find that 
monthly abnormal returns are insignificant for portfolios containing stocks about which analysts 
and short sellers strongly concur (e.g., least favorable recommendation and high short interest, or 
most favorable recommendation and low short interest). In contrast, returns are highly significant 
for portfolios of stocks in which they strongly conflict, if we trade consistent with the short sellers. 
Specifically, we find that an investor would obtain an average monthly abnormal return of 111 
basis points from a zero-investment strategy that (1) invests long in firms with the worst recom- 
mendations (sell signal) but the lowest short interest levels (buy signal), and (2) invests short in 
firms with the best recommendations (buy signal) but the highest short interest levels (sell signal). 
The monthly abnormal return from this strategy is statistically significant in each sub-period, 
ranging from 71 basis points in 2004-2006 to 130 basis points in 1999-2003. Our dual-signal 
approach, therefore, provides the most investment value during the volatile 1999-2003 sub-period. 
By comparison, following analyst recommendations would cause investors to experience sizable 
losses over this sub-period, and the returns to trading on short interest in isolation would be 
statistically insignificant. 
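
The dual-signal strategy can be sketched in the same spirit. The code below sorts firms independently on each signal and takes the two "conflict" corner cells of the 5 X 5 grid. Independent sorting, equal weighting, the function names, and the toy data are assumptions of this sketch, not a description of the exact portfolio construction used in the study.

```python
from statistics import mean


def quintile(values):
    """Quintile rank (1 = lowest, 5 = highest); ties broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    q = [0] * len(values)
    for rank, idx in enumerate(order):
        q[idx] = rank * 5 // len(values) + 1
    return q


def conflict_hedge_return(rec, si_ratio, abnormal_returns):
    """Trade with the short sellers when the two signals conflict.

    Long:  worst recommendations (quintile 1) and lowest short interest (quintile 1).
    Short: best recommendations (quintile 5) and highest short interest (quintile 5).
    Returns None when either corner cell is empty.
    """
    q_rec = quintile(rec)      # rec uses reversed coding: 5 = strong buy
    q_si = quintile(si_ratio)
    cells = list(zip(abnormal_returns, q_rec, q_si))
    long_leg = [r for r, a, s in cells if a == 1 and s == 1]
    short_leg = [r for r, a, s in cells if a == 5 and s == 5]
    if not long_leg or not short_leg:
        return None
    return mean(long_leg) - mean(short_leg)


# Hypothetical cross-section of ten firms.
rec = [2.0, 2.5, 3.0, 3.2, 3.5, 3.7, 3.9, 4.1, 4.5, 4.8]
si = [0.001, 0.050, 0.010, 0.020, 0.015, 0.030, 0.025, 0.040, 0.090, 0.002]
ret = [0.012, -0.004, 0.001, 0.000, 0.002, -0.001, 0.003, -0.002, -0.009, 0.004]

print(conflict_hedge_return(rec, si, ret))
```

With only ten firms most of the 25 cells are empty, which is why the function guards against empty legs; a large cross-section would populate every cell.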

Our study contributes to our understanding of short sellers by (1) documenting their efficient 
use of a number of predictive variables discussed in the academic literature, (2) showing that open 
short positions are incrementally useful in predicting future returns after controlling for those 
predictive variables, and (3) showing that the returns to mimicking the trading of short sellers are 
much larger when conditioned on conflicting analyst recommendations. Collectively, these find- 
ings provide a more complete picture of how short sellers influence equity price formation and 
should be of interest to academics, investors, and regulators. This topic has assumed considerable 
importance due to the alleged role of short selling in the dramatic decline in stock prices that 
began with the 2008 credit crisis (Boehmer et al. 2009). Academics are also likely to be interested 
in the implications of our empirical results for Miller's (1977) theory that binding short sale 
constraints cause pessimists to be under-represented in price formation, leading to overvaluation 
when a strong divergence of opinion exists about a stock. Our evidence shows that short sellers are 
under-represented in price formation whenever they disagree with analysts, regardless of whether 
they are the optimists or the pessimists. 

Section II describes the selection of our sample. In Section III, we investigate the relation of 
short interest and analyst recommendations with information that has been shown by prior re- 
search to predict future returns. Section IV presents returns from trading strategies that use analyst 
recommendations, short interest, or both of these signals together. This section also reports the 
results of several robustness tests. We discuss our results and conclude in Section V. 





II. SAMPLE SELECTION AND DESCRIPTIVE STATISTICS 
To perform our analysis, we require time-series data on analyst recommendations, open short 
positions, the 11 predictor variables, and stock returns. We obtain analyst recommendations from 





3 This finding complements earlier studies that find that stocks with high short interest have significant negative abnormal 
returns (e.g., Asquith and Meulbroek 1995; Asquith et al. 2005b; Desai et al. 2002). Notably, abnormal returns from 
trading on high short interest are not necessarily indicative of market inefficiency due to limits to arbitrage. In contrast, 
the finding that stocks with low short interest have significant positive returns is clearly inconsistent with market 
efficiency since buy-and-hold strategies are not subject to similar constraints (see Boehmer et al. [2010] for further 
discussion). 
the Thompson Financial I/B/E/S Recommendations database. Beginning in late 1993, I/B/E/S 
provides analyst recommendations for a wide cross-section of firms. I/B/E/S codes recommenda- 
tions into five ordered categories: strong buy = 1; buy = 2; hold = 3; sell = 4; and strong sell = 
5. For analyses using recommendations, we reverse this coding (i.e., strong buy = 5; strong sell = 
1) to allow for a more intuitive interpretation of our results. Each month, we calculate the con- 
sensus recommendation level (Rec) as the mean of all outstanding recommendations issued a 
maximum of 12 months prior to month-end.⁴ We use only the most recent individual analyst 
recommendation in the calculation. We also require that the individual analyst recommendations 
be issued on or before the I/B/E/S consensus recommendation date.⁵ 
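
A minimal sketch of this consensus calculation follows. The record layout, the function name, and the use of 365 days as a stand-in for "12 months" are assumptions for illustration; the reversal of the I/B/E/S coding (6 minus the code) follows the description above.

```python
from datetime import date
from statistics import mean


def consensus_rec(recs, as_of, max_age_days=365):
    """Mean reversed-coded recommendation (5 = strong buy, 1 = strong sell).

    recs: iterable of (analyst_id, issue_date, ibes_code), where ibes_code
          runs from 1 (strong buy) to 5 (strong sell).
    Only each analyst's most recent recommendation issued on or before
    `as_of` and no more than `max_age_days` earlier is used.
    Returns None when no recommendation qualifies.
    """
    latest = {}
    for analyst, issued, code in recs:
        if issued > as_of or (as_of - issued).days > max_age_days:
            continue  # issued after the consensus date, or stale
        if analyst not in latest or issued > latest[analyst][0]:
            latest[analyst] = (issued, code)
    if not latest:
        return None
    return mean(6 - code for _, code in latest.values())


# Hypothetical recommendations for one firm.
sample = [
    ("A", date(2005, 1, 10), 1),  # superseded by A's later recommendation
    ("A", date(2005, 6, 1), 3),
    ("B", date(2004, 3, 1), 2),   # older than 12 months, excluded
    ("C", date(2005, 5, 20), 2),
    ("D", date(2005, 7, 1), 5),   # issued after the consensus date, excluded
]
print(consensus_rec(sample, as_of=date(2005, 6, 16)))
```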

In the first set of analyses, we conduct separate tests of analysts’ and short sellers’ use of 
information that is predictive of future returns. We perform these tests using analyst recommen- 
dations and short interest as of the last month in each calendar quarter, consistent with JKKL.⁶ We 
include recommendation revisions in these analyses, given that prior research finds that recom- 
mendation revisions might be better indicators of future stock price performance than recommen- 
dation levels (Womack 1996; JKKL; Barber et al. 2010). We calculate recommendation revisions 
as the change in recommendation levels from calendar quarter t-1 to quarter t (i.e., consecutive 
quarters). An increase (decrease) in the consensus recommendation indicates an upgrade (down- 
grade) in the stock relative to the previous calendar quarter. 

We obtain short interest data to correspond with our analyst recommendations sample period. 
The Compustat Monthly Securities Database contains monthly short interest for all firms listed on 
U.S. exchanges beginning in 2003. For earlier years, we purchased monthly short interest data 
directly from the NYSE, AMEX, and NASDAQ exchanges, and from an online independent 
vendor.⁸ The stock exchanges report open short positions using the 15th of each calendar month as 
the settlement date (or the last business day before the 15th). We scale short interest by the number 
of shares outstanding as reported by CRSP and label the resulting ratio as SIratio, which is 
common in the literature (e.g., Dechow et al. 2001; Asquith et al. 2005b; Pownall and Simko 
2005). 


4 This requirement helps alleviate concerns that our results are being unduly influenced by stale recommendations, and it 
is similar to the measure used by JKKL. Thompson Financial claims that recommendations not updated for 180 days are 
excluded from the I/B/E/S consensus recommendation (see Thompson Financial 2009, 11); however, we are uncertain as 
to how long this policy has been in place. Given our long sample period, we follow procedures implemented in prior 
research. 

5 More specifically, I/B/E/S calculates the consensus recommendation on the Thursday before the third Friday of every 
month (ranging from the 14th to the 20th day of the month). The requirement of excluding recommendations that are 
issued after this date results in an average delay of 13.7 days between the time when the consensus recommendation is 
calculated and the beginning of the returns accumulation period. This methodology serves two purposes. First, short 
interest data are made publicly available mid-month and therefore both signals (recommendations and short interest) 
are obtained at approximately the same time during the month. Second, the delay ensures that investors are given ample 
time to process and impound in price whatever new information is contained in both signals. Thus, we purposely 
exclude from our tests any drift in stock prices that occurs due to the public disclosure of the signal. In this manner, our 
methodology differs markedly from the daily rebalancing requirements employed in papers such as Barber et al. (2001) 
and Barber et al. (2010). 

6 Performing these analyses using quarterly data is intuitive given that the majority of the predictor variables (seven of the 
11) change on a quarterly basis as financial information is disclosed. We find that all inferences are the same when we 
perform the analyses using monthly data. 

7 Ljungqvist et al. (2009) provide evidence that the I/B/E/S recommendations database contains systematic errors in the 
pre-2007 files that are likely to overstate the investment value of analysts' recommendations. Our study is among the first 
to re-examine the investment value of analysts' recommendations using the cleaned 2007 database. 

8 Data from the online vendor, shortsqueeze.com, provide less than 1 percent of the total observations. The data cover a 
period in which we were unable to obtain short interest directly from the NASDAQ. We compared shortsqueeze.com 
data to that from a six-month period for which we already had short interest data from NASDAQ. The only differences 
were due to shortsqueeze.com rounding their data to the nearest hundredth place. 

9 In the "Robustness Tests" section, we examine the sensitivity of our results to deflating by lagged trading volume. 
Similar to JKKL, we select a set of 11 predictor variables for our analysis and winsorize each 
of the predictor variables at the 2.5 and 97.5 percentiles to control for outliers. We group the 
predictor variables into one of four classifications based on the nature of the variable (see the 
Appendix for details on the calculation of each variable). The first group, labeled Accounting, 
consists of earnings surprise (SUE), total accruals (TACCR), and capital expenditures (CAPEX). 
The Valuation group consists of the market-value-of-equity (MVE), earnings-to-price ratio (EP), 
book-to-market ratio (BTM), and the average daily stock turnover (TURN). The Growth group 
consists of realized sales growth (SG) and forecasted long-term growth (LTG). The fourth group, 
Momentum, consists of earnings forecast revision (FREV) and price momentum (MOM). These 
variables have been shown in prior research to be associated with future returns (see the Appendix 
for specific citations). Thus, we expect that sophisticated capital market participants, such as 
analysts and short sellers, would use information embedded in these variables when establishing 
their positions. 
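
For readers unfamiliar with winsorizing, the sketch below clips a variable at its 2.5th and 97.5th percentiles. The linear-interpolation percentile rule and the function names are assumptions of this sketch; statistical packages differ slightly in how they define percentiles.

```python
def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in [0, 100]) of a sorted, non-empty list."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def winsorize(values, lower=2.5, upper=97.5):
    """Replace values below/above the chosen percentiles with the percentile values."""
    s = sorted(values)
    lo, hi = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical variable with one extreme observation in each tail.
raw = list(range(1, 41)) + [1000]
clipped = winsorize(raw)
print(clipped[0], clipped[-1])
```

Unlike trimming, winsorizing keeps every observation in the sample and only limits the influence of the extremes.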

Finally, we obtain monthly returns data from CRSP to compute future six-month buy-and- 
hold abnormal returns (ARET6) using characteristic portfolio-matching.¹⁰ Specifically, we define 
abnormal returns as the raw buy-and-hold return adjusted for the portfolio return from 125 bench- 
mark portfolios formed based on size, book-to-market, and momentum (5 X 5 X 5), as described 
in Daniel et al. (1997). We use a holding period of six months for consistency with JKKL, but test 
the sensitivity of our results to alternative holding periods as a supplemental analysis. 
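
The abnormal return definition above amounts to compounding the firm's monthly returns over the holding period and subtracting the compounded return of its matched benchmark portfolio. A minimal sketch, with hypothetical function names and made-up monthly returns, and without the matching step that assigns each firm to one of the 125 benchmark portfolios:

```python
def buy_and_hold(monthly_returns):
    """Compound a sequence of monthly returns into one holding-period return."""
    total = 1.0
    for r in monthly_returns:
        total *= 1.0 + r
    return total - 1.0


def abnormal_return(firm_monthly, benchmark_monthly):
    """Firm buy-and-hold return minus the matched benchmark's buy-and-hold return."""
    return buy_and_hold(firm_monthly) - buy_and_hold(benchmark_monthly)


# Hypothetical six-month window: the firm earns 2 percent a month,
# its size/book-to-market/momentum benchmark earns 1 percent a month.
firm = [0.02] * 6
benchmark = [0.01] * 6
print(round(abnormal_return(firm, benchmark), 4))
```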

Our final sample, resulting from the intersection of Compustat, CRSP, I/B/E/S, and our short 
interest database, consists of 80,674 firm-quarter observations over the 52 calendar quarters from 
1994 to 2006. For our main analyses, we rank firms into quintile portfolios based on analyst 
recommendations (both levels and changes) and short interest in each calendar quarter t. Thus, we 
rebalance the portfolios quarterly. For recommendation changes, we ensure that firms without a 
recommendation revision are included in the middle quintile. 

Table 1, Panel A presents descriptive statistics for the variables used in our analyses. The 
mean (median) value for Rec of 3.76 (3.79) indicates that the average analyst recommendation is 
only moderately less than a “buy” (which would be coded 4). A narrow interquartile range of 0.34 
(-0.20 to 0.14) for the consensus recommendation change, ChgRec, shows that analyst recom- 
mendations are generally sticky. Nevertheless, the minimum and maximum values for ChgRec 
indicate that analysts occasionally downgrade a stock all the way from strong buy to strong sell, 
and vice versa. The mean short interest ratio, SIratio, is 3.2 percent, which is considerably larger 
than the median of 1.8 percent (due to some large values, as indicated by the maximum of 23.5 percent). 
The mean six-month abnormal return (Aret6) is 1.0 percent, but the median is only -1.9 percent. 

With respect to the 11 predictor variables, we find that earnings surprise (SUE) has a mean of 
zero but a slightly positive median of 0.002, consistent with most firms reporting earnings that 
meet or beat the current analyst forecast. Total accruals (TACCR) are negative, on average, due to 
deducting depreciation. Capital expenditures (CAPEX) average approximately 6.3 percent of as- 
sets. We find that firm size (MVE) is highly skewed, with a mean of $3,578 compared to a median 
of $775 (in millions). The average earnings-to-price ratio (EP) is only 2.9 percent due to some 
negative values (median 4.3 percent). The book-to-market ratio (BTM) has a mean of 0.50 (median 
0.42), consistent with prior research. Approximately 0.58 percent of a firm's shares turn over on 
any given day (TURN). Realized sales growth (SG) averages 17 percent, and analysts' long-term 
earnings growth forecasts (LTG) average 17.55 percent. Analysts' forecast revisions (FREV) have 





10 If a firm delists during the return accumulation period, we compound the delisting return with the buy-and-hold return 
and assume the liquidating proceeds are reinvested in a portfolio that earns a normal return for the remainder of the 
period. 
a mean of zero but a slightly positive median of 0.002. Price momentum (MOM) averages 8.4 
percent for the preceding six months (median 4.7 percent). 

Table 1, Panel B reports mean analyst recommendations, short interest, and market returns 
(using the S&P 500 portfolio) for each year from 1994 to 2006. From 1994 to 2000, we observe 
a monotonic increase in the average analyst recommendation, which peaks at 4 in 2000. The 
average recommendation then declines in years 2001 through 2003, and remains at a lower level 
through the end of our test period. This shift corresponds with criticism of analysts that led to the 
Global Research Analysts Settlement, NASD 2711, and NYSE Rule 472. One line of criticism 
focused on analysts' conflicts of interest, including their incentive to maintain a positive relation 
with corporate managers in order to generate investment banking business and to obtain earnings 
guidance. 

Table 1, Panel B also reports another noteworthy change over our test period. The mean level 
of short interest is around 2 percent from 1994 to 2000. The level then increases appreciably over 
the next six years, reaching 5.7 percent in the final year of our sample period. This shift, which 
corresponds with a dramatic increase in the number of hedge funds, increases the importance of 
research that furthers an understanding of the role of short selling in the price formation process. 
Note that shifts over time have a minimal effect on our results because we rank firms into quintiles 
based on their relative values at a given point in time. 


III. PREDICTIVE INFORMATION USED BY ANALYSTS AND SHORT SELLERS 
In this section, we first examine whether analyst recommendations and short interest incor- 
porate fundamental and other information in the manner shown by prior research to be predictive 
of future returns. We then investigate whether analyst recommendations and short interest provide 
information that is incremental to that information. 


Univariate Evidence 


Table 2 presents mean values for each of the 11 predictive variables by quintile for recom- 
mendation levels, recommendation changes, and short interest levels. The quintiles correspond to 
portfolios that we later use in trading analyses. In Panel A, as we move down each column from 
the worst to the best recommendations, we find a monotonic (or near monotonic) increase for eight 
of the 11 variables. The increase for SUE, EP, FREV, and MOM is consistent with analyst 
recommendations properly incorporating the relation of these measures with future returns. In 
contrast, the increase for TACCR, CAPEX, SG, and LTG indicates that analysts misuse this infor- 
mation, which could cause more favorable recommendations to portend lower investment returns. 
The overall pattern of information use indicates that analysts tend to issue more favorable recom- 
mendations for glamour stocks, even though prior studies show that these stocks earn lower 
subsequent returns (Lakonishok et al. 1994; La Porta 1996; Sloan 1996; Beneish et al. 2001). 
Examining changes in recommendations, Panel B shows a clear pattern for only three variables; 
but in each case, the change is consistent with the relation of the information with future returns 
established in prior research. Specifically, as we move down the columns from downgrades to 
upgrades, we observe a monotonic increase for earnings forecast revisions (FREV) and stock price 
momentum (MOM), and a monotonic decrease for long-term growth (LTG). While prior research 
has generally found that recommendation revisions are better predictors of future returns than are 
recommendation levels, this analysis indicates that recommendation revisions fail to incorporate 
eight of the 11 items of predictive information. These results for recommendation levels and 
changes are similar to the results documented in JKKL. 

Panel C of Table 2 provides the corresponding analysis for short interest quintiles. The 
book-to-market variable (BTM) decreases monotonically as the level of short interest increases, so 
short sellers tend to take greater positions in firms with a higher market value relative to their book 
value. Three variables increase monotonically as the level of short interest increases: capital 
expenditures (CAPEX), stock turnover (TURN), and sales growth (SG). In each case, the pattern of 
information use is consistent with their relation to future returns documented in prior research. 
Comparing the high and low short interest quintiles for the other variables shows that short 
positions are generally consistent with how the variables map into future returns, but the relation 
is not monotonic. 


Multivariate Evidence 


We next use ordered logistic regression analysis to provide a multivariate test of the relation 
between analyst and short seller investment signals and the 11 predictor variables. In all regression 
analyses, we assess statistical significance using test statistics based on standard errors that are 
adjusted for two-way clustering of residuals by firm and calendar month (Petersen 2009; Gow et 
al. 2010). Table 3, Panel A reports results using analyst recommendations and recommendation 
revision quintiles as the dependent variable, with quintiles coded from 1 to 5.¹¹ 
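
To make the ordered logistic specification concrete, the sketch below converts a linear index into probabilities for the five quintile outcomes. It assumes the common parameterization P(Y <= k) = logistic(c_k - index); sign conventions differ across statistical packages, and the cutpoints and index values shown are hypothetical rather than estimates from this study.

```python
import math


def ordered_logit_probs(index, cutpoints):
    """Outcome probabilities P(Y = 1), ..., P(Y = K) for an ordered logit model.

    index:     the linear predictor x'beta for one observation.
    cutpoints: K - 1 increasing thresholds c_1 < ... < c_{K-1}.
    Uses P(Y <= k) = 1 / (1 + exp(-(c_k - index))).
    """
    cdf = [1.0 / (1.0 + math.exp(-(c - index))) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for p in cdf:
        probs.append(p - prev)
        prev = p
    return probs


# Hypothetical cutpoints separating the five quintiles.
cuts = [-2.0, -1.0, 1.0, 2.0]
baseline = ordered_logit_probs(0.0, cuts)
shifted = ordered_logit_probs(1.0, cuts)  # a higher index favors higher quintiles
print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
```

Under this convention a positive coefficient on a predictor raises the index and shifts probability toward the higher quintiles, which is how the coefficient signs in Table 3 are read.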

For recommendation levels, we find that analysts correctly incorporate the implications for 
future returns of only one of the Accounting variables: unexpected earnings (SUE). Analysts 
favorably recommend firms with high total accruals (TACCR), and do not consider capital expen- 
ditures (CAPEX), despite evidence that increases in those accounting measures are associated with 
lower future returns (Sloan 1996). Examining the Valuation measures, analysts correctly favor 
smaller firms (LNMVE) and those with a higher earnings-to-price ratio (EP). However, they also 
favor firms with a low book-to-market ratio (BTM) and high growth (SG, LTG), despite evidence 
that stock prices of such firms underperform the market. Examining the Momentum variables, 
analysts correctly favor firms with high earnings momentum (FREV) and stock price momentum 
(MOM). 

The results for revisions in analysts' recommendations are reported on the right side of Table 
3, Panel A. Examining the Accounting variables, we find that all three variables (SUE, TACCR, 
and CAPEX) are statistically significant, but with the unexpected sign. For the Valuation and 
Growth variables, the evidence is mixed: EP and TURN are statistically significant in the expected 
direction, but BTM is statistically significant in the unexpected direction. The coefficient on LTG 
is also significant in the unexpected direction. For the Momentum variables, we find that both 
MOM and FREV are statistically significant in the expected direction. Finally, we find that rec- 
ommendation changes are negatively associated with past recommendation levels. This result 
makes intuitive sense because the highest (lowest) recommendations can only be revised down 
(up). 

Considering the types of information used by analysts in both their recommendations and 
recommendation revisions, analysts correctly favor stocks with positive price momentum (MOM), 
positive earnings momentum (FREV), and high earnings-to-price (EP). They incorrectly favor 
stocks with high forecasted growth (LTG), high accruals (TACCR), and low book-to-market value 
(BTM). Thus, financial analysts view higher past and future growth and higher accruals as positive 
features in recommending stocks, despite research that shows the opposite relation (Lakonishok et 
al. 1994; La Porta 1996; Sloan 1996). In addition, analysts also tend to issue more favorable 
recommendations for firms with low book-to-market ratios, even though prior research shows a 
positive association with subsequent returns (Fama and French 1992). This evidence indicates that 


11 Note that quintiles are of approximate equal size (after adjusting for ties and including all recommendation revisions of 
zero in the middle quintile). Due to the low frequency of strong sell and sell recommendations issued by analysts, the 
most unfavorable recommendation quintile contains some “hold” recommendations. 


The Accounting Review January 2011 
American Accounting Association 


Drake, Rees, and Swanson 








TABLE 3 
Use of Predictive Information by Analysts and Short Sellers 

Panel A: Explaining Recommendation Levels and Changes (Using Ordered Logistic Regression) 

Columns: Variable; Predict; Coefficient and Chi-square for Recommendation Levels; Coefficient and Chi-square for Recommendation Changes. 

Variable                                           Predict 
Accounting 
  SUE                                              Pos 
  TACCR                                            Neg 
  CAPEX                                            Neg 
Valuation 
  LnMVE                                            Neg 
  EP                                               Pos 
  BTM                                              Pos 
  TURN                                             Neg 
Growth 
  SG                                               Neg 
  LTG                                              Neg 
Momentum 
  FREV                                             Pos 
  MOM                                              Pos 
Past recommendation level (changes model only)     Neg 
Pseudo R²: 0.138 (Recommendation Levels); 0.098 (Recommendation Changes) 

[Coefficient and chi-square entries are illegible in the source.] 







Should Investors Follow the Prophets or the Bears? 


Panel B: Explaining Short Interest (Using Ordered Logistic Regression) 

Columns: Variable; Predict; Coefficient and Chi-square for Short Interest. 

Variable      Predict 
Accounting 
  SUE         Neg 
  TACCR       Pos 
  CAPEX       Pos 
Valuation 
  LnMVE       Pos 
  EP          Neg 
  BTM         Neg 
  TURN        Pos 
Growth 
  SG          Pos 
  LTG         Pos 
Momentum 
  FREV        Neg 
  MOM         Neg 
Pseudo R²: 0.332 

[Coefficient and chi-square entries are illegible in the source.] 

*, **, *** Indicate statistical significance at the α = 0.10, 0.05, and 0.01 levels, respectively, using a two-tailed test. 

n = 80,674 firm-quarters. 

This table reports estimation results when analysts' recommendation and short interest quintile assignments are regressed 
(using ordered logit) on 11 variables shown to be predictive of future returns. We do not report the intercepts for parsimony. 
See Table 1 for descriptions of each variable, and the Appendix for detailed explanations of how each variable is 
calculated. The “Predict” column reports the predicted relation between the explanatory variable and future returns as 
indicated in prior research. We report test statistics based on standard errors that are adjusted for two-way clustering of 
residuals by firm and calendar month. 




sell-side analysts tend to favorably recommend “glamour stocks.” JKKL reached the same con- 
clusion based on an analysis of an earlier time period. 

Table 3, Panel B reports results from a model using short interest quintiles as the dependent 
variable. We find that all 11 variables are statistically significant with coefficient signs in the 
expected direction. Additionally, we note that the explanatory power of this model (Panel B, 
pseudo R² = 33.2 percent) is more than double that for the model using recommendation levels 
(Panel A, pseudo R² of 13.8 percent) and more than triple that for the model using recommenda- 
tion revisions (Panel A, pseudo R² of 9.8 percent).¹³ Thus, consistent with our univariate analysis, 
we find that short interest is explained better by the predictive information in Accounting, Valua- 
tion, and Growth variables than are analyst recommendation levels or changes. Our evidence is 
consistent with other studies that examine the association between short interest and indicators of 
future returns (Dechow et al. 2001; Cao et al. 2007; Seybert and Wang 2009). Our study extends 
this research by employing several predictive variables simultaneously in the same regression 
model, and by comparing results for short sellers to those for analysts. 


Incremental Information About Future Returns 


In this section, we investigate whether recommendations, recommendation changes, and short 
interest contain incremental information about future returns, beyond the information in the 11 
predictive variables. Using the methodology in JKKL, we convert the continuous predictor vari- 
ables into binary signals based on a median split. For all variables where the expected relation 
with future returns is positive (negative), the binary variable is coded 1 when its value is greater 
(less) than its median for a given quarter, and 0 otherwise. Thus, we expect a positive coefficient 
on all predictor variables. 
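
The median-split coding can be illustrated as follows. The function name and the sample values are hypothetical; the rule itself (code 1 on the side of the quarterly median that prior research associates with higher future returns) follows the description above.

```python
from statistics import median


def binary_signal(values, expected_sign):
    """Median-split indicator for one predictor in one quarter.

    expected_sign > 0: code 1 when the value is above the cross-sectional median.
    expected_sign < 0: code 1 when the value is below the cross-sectional median.
    Observations exactly at the median are coded 0 in both cases.
    """
    m = median(values)
    if expected_sign > 0:
        return [1 if v > m else 0 for v in values]
    return [1 if v < m else 0 for v in values]


# Hypothetical quarter with five firms.
sue = [0.1, -0.2, 0.3, 0.0, 0.5]         # expected relation with returns: positive
taccr = [0.05, -0.03, 0.02, -0.08, 0.0]  # expected relation with returns: negative

print(binary_signal(sue, +1))
print(binary_signal(taccr, -1))
```

Because every indicator equals 1 on the favorable side of the median, each is expected to enter the return regression with a positive coefficient regardless of the sign of the underlying relation.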

Table 4 presents the results from our analysis. The model on the left includes both recom- 
mendation level quintiles (QRec) and short interest quintiles (QSIratio). Each of these quintiles is 
scaled to range from 0 to 1 to facilitate interpretation of coefficients. The coefficient on QRec 
indicates that analysts’ recommendations are incrementally informative about future abnormal 
returns, but the coefficient is negative. This result suggests that buy-and-hold investors would do 
better to trade against the consensus analyst recommendation. In contrast, the coefficient on 
QSIratio is significantly negative, indicating that short interest provides incremental information 
for predicting future returns even after controlling for the information contained in the 11 predictor 
variables. The negative sign indicates that, as would be expected, a higher (lower) level of short 
interest is associated with lower (higher) future abnormal returns. The coefficient magnitude for 
QSIratio of -0.029 can be interpreted as the six-month return earned on an investment portfolio 
that is formed optimally to exploit the information in short interest that is orthogonal to the 
information in analysts' recommendations and the predictive variables. Thus, while short sellers 
use the information contained in the investment signals (as indicated by the significant associa- 
tions reported in Table 3), the results in Table 4 show that short sellers also develop information 
to predict future returns that goes beyond what is contained in those variables. Examining the 
predictor variables, we find significant coefficients in the expected direction for TACCR, SG, and 
FREV.¹⁴ 
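
Because the quintile ranks are rescaled to run from 0 to 1, the coefficient on a scaled quintile equals the implied difference in six-month abnormal return between the top and bottom quintiles, holding the other regressors fixed. The short sketch below works through that arithmetic using the QSIratio coefficient of -0.029 from Table 4; the function names are hypothetical.

```python
def scale_quintile(q):
    """Map a quintile rank 1..5 onto 0.00, 0.25, 0.50, 0.75, 1.00."""
    return (q - 1) / 4


def implied_return_gap(coef, high_q=5, low_q=1):
    """Difference in fitted return between two quintiles, other regressors fixed."""
    return coef * (scale_quintile(high_q) - scale_quintile(low_q))


COEF_QSIRATIO = -0.029  # coefficient on QSIratio reported in Table 4

gap = implied_return_gap(COEF_QSIRATIO)
print(f"{gap:.1%}")  # highest versus lowest short interest quintile
```

Moving by a single quintile changes the scaled rank by 0.25, so the implied effect of a one-quintile step is one quarter of the coefficient.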


12 Note that the explanatory variables have the opposite predicted sign in the short interest model (compared to the 
recommendation models). 

13 Since the dependent variables differ across models, it is not possible to test for differences in explanatory power. 
However, given that we have standardized the dependent variables by ranking them into quintiles, their variation is 
similar. Specifically, the standard deviations of the quintile ranking of analyst levels, analyst changes, and short interest 
are 1.41, 1.38, and 1.41, respectively. Thus, we believe a comparison of pseudo R² is informative. 

14 The coefficient for TURN is marginally significant with the wrong sign. This appears to be driven by its high correlation 
with short interest (greater than 50 percent). When QSIratio is excluded from the model, the coefficient for TURN is not 
TABLE 4 


Incremental Information about Future Returns Provided by Recommendations, 
Recommendation Revisions, and Short Interest 


                Recommendations            Recommendation Changes 
                and Short Interest         and Short Interest 
Variable        Coefficient   t-stat       Coefficient   t-stat 
Intercept          0.009       0.70          -0.003      -0.31 
QRec              -0.019      -2.55** 
QChgRec                                       0.003       0.59 
QSIratio          -0.029      -3.91***       -0.027      -3.71*** 
DSUE               0.004       0.79           0.003       0.63 
DTACCR             0.025       7.10***        0.025       7.23*** 
DCAPEX             0.004       0.75           0.004       0.72 
DLnMVE             0.003       0.72           0.003       0.68 
DEP               -0.003      -0.45          -0.004      -0.54 
DBTM               0.005       0.93           0.007       1.19 
DTURN             -0.012      -1.85*         -0.012      -1.80* 
DSG                0.015       2.43**         0.016       2.66*** 
DLTG              -0.009      -1.08          -0.007      -0.85 
DFREV              0.011       2.05**         0.009       1.75* 
DMOM               0.009       1.13           0.007       0.94 
Adj-R²             0.003                      0.003 


*, **, *** Indicate statistical significance at the α = 0.10, 0.05, and 0.01 levels, respectively, using a two-tailed test. 


n = 80,674 firm-quarters. 


This table reports estimation results when future six-month abnormal returns (calculated as in Daniel et al. [1997]) are 
regressed on analysts’ recommendations and short interest data along with 11 variables that prior research shows to be 
predictive of future returns. QRec is the quintile assignment based on recommendation levels. QChgRec is the quintile 
assignment based on recommendation revisions. QSIratio is the quintile assignment based on short interest. QRec, 
QChgRec, and QSIratio are scaled to range between 0 and 1 (0.00, 0.25, 0.50, 0.75, 1.00) to facilitate the interpretation of 
the coefficients. Using the methodology in Jegadeesh et al. (2004), we convert the continuous predictor variables into 
binary signals based on a median split. For all variables where the expected relation with future returns is positive 
(negative), the binary variable is coded 1 when its value is greater (less) than its median for a given quarter, and 0 
otherwise. See Table 1 for descriptions of each variable, and the Appendix for detailed explanations of how each variable 
is calculated. We report test statistics based on standard errors that are adjusted for two-way clustering of residuals by firm 
and calendar month. 
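
The variable coding described in the note to Table 4 can be illustrated with a minimal sketch. It is not the authors' code: the function names, the ten-firm cross-section, and the assumption that turnover has a negative expected relation with future returns are hypothetical and chosen only to show the mechanics of the scaled quintile assignment and the median-split binary signal.

```python
from statistics import median

def scaled_quintiles(values):
    """Assign quintiles within one cross-section, scaled to 0.00, 0.25, ..., 1.00."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scaled = [0.0] * len(values)
    for rank, idx in enumerate(order):
        quintile = min(rank * 5 // len(values), 4)  # 0 (lowest) .. 4 (highest)
        scaled[idx] = quintile / 4
    return scaled

def binary_signal(values, expected_sign):
    """Code 1 when a value is above (below) the cross-sectional median and the
    expected relation with future returns is positive (negative); 0 otherwise."""
    med = median(values)
    if expected_sign > 0:
        return [1 if v > med else 0 for v in values]
    return [1 if v < med else 0 for v in values]

# Hypothetical data for ten firms in a single quarter.
short_interest = [0.01, 0.08, 0.03, 0.12, 0.02, 0.05, 0.20, 0.04, 0.07, 0.10]
turnover = [0.5, 1.2, 0.8, 2.0, 0.3, 0.9, 1.5, 0.7, 1.1, 0.6]

qsi_ratio = scaled_quintiles(short_interest)
# Assumption for illustration: high turnover predicts lower returns.
d_turn = binary_signal(turnover, expected_sign=-1)
```

Scaling the quintile ranks to the unit interval means a regression coefficient on the scaled variable can be read as the return difference between the top and bottom quintiles.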





The model on the right side of Table 4 provides a similar analysis for recommendation
revision quintiles (QChgRec) and short interest (QSIratio) quintiles. The coefficient on QChgRec
is positive as expected, but is not significantly different from zero, suggesting that recommendation
revisions do not provide information about future returns that is incremental to short interest
and to the other publicly available investment signals. The coefficient on QSIratio is again
significantly negative. The coefficients and significance levels for the predictor variables are similar
to those reported in the recommendation level regressions.¹⁵


statistically significant (t-stat = -0.52). In contrast, QSIratio remains highly significant when TURN is excluded from
the model.

15 We also estimated a regression equation that includes QRec, QChgRec, and QSIratio together with the other 11
predictive variables. Results from this regression are qualitatively equivalent to what is reported in Table 4, except that

We draw the following general conclusions from the results presented in Table 3 and Table 4.
First, analysts’ recommendations do a poor job of incorporating information about future returns
provided by the predictor variables, and buy-and-hold investors would actually do better by
trading against analyst recommendations. Second, analyst recommendation revisions also do a
poor job of using predictive information, and they do not contribute information beyond what is
contained in the predictor variables and short interest. Third, short sellers correctly incorporate
publicly available information that is predictive of future returns and, furthermore, short sellers are
able to generate information that is orthogonal to the set of predictive variables we use in our
analysis. In the next section, we explore the success of trading strategies that are designed to
exploit the above results.


IV. INVESTMENT PERFORMANCE BASED ON ANALYST RECOMMENDATIONS 
AND SHORT INTEREST 

The results reported in Section III suggest that investors might improve their returns by using
short interest as a supplementary investment signal. Indeed, many of the associations between
analyst recommendations and variables that are predictive of future returns suggest that analysts
might actually impede the price discovery process. Pownall and Simko (2005) provide evidence
that short positions can help to fill an information intermediary gap for companies with low
analyst following. Our results suggest that short interest can be used as an information signal for
investors, regardless of the analyst following. The benefit of short interest is likely to be greatest
when this signal contradicts the signal from analysts' recommendations. Specifically, the results in
Table 4 suggest that investors would profit from a buy-and-hold strategy that (1) trades against
analyst recommendation levels and (2) trades with short interest. This conjecture is based on the
signs of the statistically significant coefficients for analyst recommendations and short interest in
Table 4.¹⁶

The trading strategies we consider are implementable and follow the portfolio construction
methodology outlined in Jegadeesh and Titman (1993).¹⁷ Under this methodology, the strategies
hold a series of sub-portfolios that are formed in the current month and in each of the previous five
months (six-month holding period). Thus, we simulate a portfolio where a 1/6 fraction of the
stocks are reassigned to portfolios each month. We rebalance the portfolios monthly to maintain
equal weights on each security and calculate the mean abnormal return for each portfolio. This
results in a time-series of monthly portfolio returns that is free of overlapping return accumulation
periods. As discussed in further detail below, we also calculate hedge portfolio returns that go long
and short in particular portfolios.
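
The overlapping sub-portfolio construction can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the three hypothetical stocks, and the timing convention (a cohort formed in month t is held in months t through t+5) are all chosen for the example. Each month's strategy return is the simple average of the equal-weighted returns of the six live cohorts, so each cohort carries a 1/6 weight.

```python
def monthly_strategy_returns(formations, abnormal_returns, holding_months=6):
    """Average the equal-weighted abnormal returns of all live cohorts each month.

    formations[t]       -- list of stocks selected into the cohort formed in month t
    abnormal_returns[t] -- dict mapping stock -> abnormal return in month t
    """
    results = []
    for t in range(holding_months - 1, len(formations)):
        live = formations[t - holding_months + 1 : t + 1]  # six overlapping cohorts
        cohort_means = []
        for cohort in live:
            rets = [abnormal_returns[t][stock] for stock in cohort]
            cohort_means.append(sum(rets) / len(rets))  # equal weights within cohort
        results.append(sum(cohort_means) / len(cohort_means))  # 1/6 weight per cohort
    return results

# Hypothetical cohorts formed in seven consecutive months.
formations = [["A", "B"], ["B"], ["A", "C"], ["C"], ["A"], ["B", "C"], ["A", "B"]]
flat = {"A": 0.0, "B": 0.0, "C": 0.0}
abnormal_returns = [flat] * 5 + [
    {"A": 0.02, "B": 0.00, "C": -0.02},  # month 5
    {"A": 0.01, "B": 0.03, "C": -0.01},  # month 6
]

series = monthly_strategy_returns(formations, abnormal_returns)
```

Because each calendar month contributes exactly one observation to the series, the resulting returns do not overlap even though each cohort is held for six months.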

Since the consensus analyst recommendation and the short interest ratio change each month, 
each portfolio is based on information from analysts and/or short sellers from the current month 
and from each of the past five months. In our initial tests, we examine the signals from analysts or 
short sellers separately. We then combine recommendations and short interest to develop a trading 
strategy that exploits the information contained in both signals. When combining the signals, we 
intersect the analyst recommendation and short interest quintiles to produce 25 portfolios (5 X 5). 




the coefficient on QChgRec becomes marginally significantly positive. 

16 In this section, we do not report trading strategies that use recommendation revisions because results from these tests are
generally consistent with our earlier analyses that show they contain no incremental information about future returns
beyond short interest.

17 The Jegadeesh and Titman (1993) methodology avoids statistical problems associated with serial correlation induced by 
overlapping return accumulation periods. We thank an anonymous reviewer for suggesting this approach. 

18 In a later section, we discuss the sensitivity of the results to a shorter return window of one month and the use of a
four-factor model, which controls for the Fama-French risk factors and momentum.

We form our portfolios using all firm-month observations with available data for analyst
recommendations, short interest levels, and stock returns. Thus, we drop the requirement that data
are available for the 11 predictor variables, and we include every month (not just the last month of
each quarter, as in our first set of analyses). These changes increase the sample size to 564,101
firm-month observations and, as a result, increase the extent to which we can generalize results.¹⁹
We report results for the full time period, 1994 to 2006, and for three sub-periods because 
documenting the stability of abnormal returns to a particular trading strategy over time is critical 
to the evaluation of the strategy. The first sub-period consists of the five-year period from 1994 to 
1998, which overlaps the 1985 to 1998 time period investigated by JKKL, and corresponds to a 
strong bull market. The second sub-period consists of the subsequent five years, from 1999 to 
2003, which includes the precipitous decline in the NASDAQ index, the adoption of Regulation 
Fair Disclosure, and the Global Settlement agreement reached between the SEC, NASD, NYSE, 
and ten of the largest investment firms. The final sub-period consists of the most recent years in 
our sample, 2004 to 2006. 

In Figure 1, we display six-month abnormal quarterly returns over the full test period for
stocks in the least favorable recommendation quintile and for those in the most favorable
recommendation quintile. Our first observation is that the variability of the absolute value of the
abnormal returns differs considerably among the three sub-periods. The variability is much higher in the
1999-2003 sub-period than in the preceding 1994-1998 sub-period; return variability then drops
precipitously and remains relatively low throughout the 2004-2006 sub-period. Our second
observation is that analyst recommendations are not very reliable as a predictor of future returns. In
fact, stocks in the least favorable recommendation quintile earn greater returns than those in the
most favorable quintile in 35 of the 72 quarters. So, flipping a coin appears to be just as predictive.


FIGURE 1
Comparison of Six-Month Abnormal Returns to the Most and Least Favorable
Recommendations
Consensus recommendations as of the last month of calendar quarter t are sorted into quintiles, with the highest
quintile designated the most favorable portfolio and the lowest quintile the least favorable portfolio. Portfolios
are reformed each calendar quarter and we report abnormal buy-and-hold returns for the six months beginning
the first day of quarter t+1.


19 In a robustness test (untabulated), we find that inferences from our trading strategies are unchanged when we restrict the
sample to include firms with available data for the 11 predictor variables.


Returns for Portfolios Based on Analyst Recommendations or Short Interest


Table 5, Panel A reports average monthly abnormal returns based on analyst recommendations.
The portfolios cover the 52 calendar quarters from 1994 to 2006. If analyst recommendations
provide value to investors, then returns should be higher for more highly recommended
stocks. That is, returns for stocks in the quintile with the highest average recommendation should
exceed returns for lower recommendation quintiles. Consistent with the results previously
presented in Table 4 (which indicate that analysts improperly incorporate value-relevant information
into their forecasts), we find the opposite relation: the mean abnormal returns generally
decrease as the recommendation level increases. When we calculate returns from investing
long in firms classified in the most favorable recommendation quintile and taking an offsetting
short position in firms in the least favorable recommendation quintile, we obtain a modestly
negative return of 26 monthly basis points (t = -2.03; p < 0.01).

Table 6, Panel A summarizes returns by recommendation quintile for each of the three
sub-periods and provides statistical tests. The investment results vary considerably among the three
sub-periods. However, the most favorable recommendation quintile does not perform better than
fourth among the quintiles (see ranks in the right-most column). Two statistically significant
portfolios fall in the 1999-2003 period, when returns are positive for the lowest recommendation
quintiles. We also find that the hedge portfolio trading strategy (buying firms in the highest
recommendation quintile and selling short those in the lowest recommendation quintile) yields a
statistically significant return only in the 1999-2003 sub-period, and that return is a negative 48
monthly basis points. Overall, following the consensus analyst recommendation does not generate
positive abnormal returns for investors and, in some periods, can actually generate significantly
negative returns. While perhaps counterintuitive, this result is consistent with Barniv et al. (2009),
who find that analysts’ recommendations relate negatively to residual income valuation models,
and with Seybert and Wang (2009), who find that firms with more optimistic recommendations
earn lower future returns in periods of high investor sentiment.²⁰

Table 5, Panel B reports average monthly abnormal returns from portfolios formed using short 
interest levels. We find that the average abnormal return across the quintiles declines monotoni- 
cally from lowest to highest short interest. A trading strategy that invests long (short) in firms that 
are in the lowest (highest) short interest quintile provides an average monthly abnormal return of 
56 basis points, which is both economically and statistically significant (t = 2.46; p < 0.01). 
Interestingly, 29 basis points of that return are earned because short sellers avoid sizable positions
in stocks that yield positive future returns. Thus, not only are short sellers able to identify stocks
that are likely to fall in price, but they successfully avoid stocks that are likely to realize price
increases. This finding is consistent with concurrent research by Boehmer et al. (2010), who also
find that the positive signals from low short interest can be equal to or greater in absolute
magnitude than the negative signal from high short interest.


20 Relative to JKKL, we find a more negative relation between analyst recommendations (both levels and changes) and
future returns. Our sub-period analysis suggests that an important factor driving this result is the different time periods
examined across the two studies.

Table 6, Panel B presents returns from trading on short interest levels for each of the three 
sub-periods. A monotonic pattern of decreasing returns occurs within each of the three sub-periods 
as one reads down the table from lowest to highest short interest, with several of the individual 
quintiles statistically significant. Over the 2004—2006 sub-period, the zero-investment hedge strat- 
egy produces a statistically significant abnormal return of 54 monthly basis points. The monthly 
hedge returns are similar for the other sub-periods but are not statistically significant. Thus, while 
larger abnormal returns can be earned by trading on the level of short interest rather than on 
analysts’ recommendations, the hedge strategy results are not of sufficient magnitude to be statis- 
tically significant in each sub-period. Interestingly, the returns to buying stocks in the lowest short 
interest quintile are more likely to be statistically significant than are the returns to selling short the 
stocks in the high short interest quintile. Overall, short interest provides a more reliable signal 
about which stocks to buy than about which stocks should be sold short. 
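
As a rough illustration of how a zero-investment hedge return and its t-statistic might be computed from two monthly return series, consider the sketch below. It assumes a plain time-series t-test on the monthly long-minus-short differences; the text does not spell out the exact test used for the portfolio returns, and the function name and the four months of data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def hedge_t_stat(long_returns, short_returns):
    """Mean monthly long-minus-short return and its simple time-series t-statistic."""
    hedge = [l - s for l, s in zip(long_returns, short_returns)]
    avg = mean(hedge)
    standard_error = stdev(hedge) / sqrt(len(hedge))
    return avg, avg / standard_error

# Hypothetical monthly abnormal returns for the lowest and highest short interest quintiles.
low_short_interest = [0.010, 0.004, 0.006, 0.008]
high_short_interest = [0.002, 0.000, -0.002, 0.004]

avg_hedge, t_stat = hedge_t_stat(low_short_interest, high_short_interest)
```

A hedge return of a given size can be significant in one sub-period and not in another simply because the month-to-month variability of the difference series changes, which is consistent with the sub-period pattern described above.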


Returns for Portfolios Combining Analyst Recommendations and Short Interest Levels 


The evidence presented to this point indicates that none of the signals in isolation produce 
consistent results across each sub-period. In this section, we investigate whether combining the 
signals from analysts and short sellers can improve upon this investment performance. We con- 
sider trading strategies that are based on concurring and conflicting signals from analysts and short 
sellers. To conduct our analysis, we independently sort analysts’ recommendations and short 
interest into quintiles and merge these quintile rankings to form 25 different portfolios. Since the 
sorts are independent, the average number of stocks across portfolios is not equal. Nevertheless, as 
reported in Table 7, the sample is broadly distributed across the 25 different portfolios, ranging 
from an average number of firms per portfolio of 492 to 950. This suggests that analysts and short 
sellers are just as likely to disagree as they are to agree on the future prospects of any one firm. 
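
The independent double sort can be sketched as follows. The code is illustrative only: the function names and the 25-firm data set are hypothetical, and it assumes a recommendation scale on which higher values are more favorable. Each variable is ranked into quintiles on its own, and the two rankings are then intersected to give up to 25 cells.

```python
from collections import defaultdict

def quintile_ranks(values):
    """Return quintile ranks 1..5 (1 = lowest values) for one cross-section."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for pos, idx in enumerate(order):
        ranks[idx] = min(pos * 5 // len(values), 4) + 1
    return ranks

def double_sort(recommendation, short_interest, abnormal_return):
    """Mean abnormal return for each (recommendation quintile, short interest quintile) cell."""
    rec_q = quintile_ranks(recommendation)
    si_q = quintile_ranks(short_interest)
    cells = defaultdict(list)
    for r, s, a in zip(rec_q, si_q, abnormal_return):
        cells[(r, s)].append(a)
    return {cell: sum(v) / len(v) for cell, v in cells.items()}

# Hypothetical cross-section: 25 firms, one per cell, with returns that fall as
# recommendations become more favorable and as short interest rises.
recommendation = [i // 5 for i in range(25)]   # assumption: higher = more favorable
short_interest = [i % 5 for i in range(25)]
abnormal_return = [0.01 - 0.001 * (i // 5) - 0.002 * (i % 5) for i in range(25)]

cells = double_sort(recommendation, short_interest, abnormal_return)

# Signals conflict: buy least favorable rec / lowest SI, short most favorable rec / highest SI.
conflict_hedge = cells[(1, 1)] - cells[(5, 5)]
# Signals concur: buy most favorable rec / lowest SI, short least favorable rec / highest SI.
concur_hedge = cells[(5, 1)] - cells[(1, 5)]
```

Because the two sorts are independent, the number of firms per cell is not forced to be equal in real data; the one-firm-per-cell layout here is purely a convenience of the example.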

Table 7 presents abnormal returns for portfolios formed using both analyst recommendations 
and short interest. Reading down each recommendation column in Table 7, Panel A shows that 
abnormal returns tend to decline as the level of short interest increases. Notably, the decline is 
monotonic for the three quintiles with the most favorable recommendations. In addition, reading 
across each short interest row from left to right shows a general tendency for abnormal returns to 
decrease as the recommendation becomes more favorable. The combined effect is that the lowest 
returns occur in the lower right quadrant and the highest in the upper left quadrant. 

Table 7, Panel B reports returns for two zero-investment trading strategies. The first strategy 
is to trade when short sellers and analysts strongly concur about a company's prospects. The 
specific trade is to buy firms in the portfolio with the most favorable recommendations and lowest 
short interest, and to sell short firms with the worst recommendations and highest short interest. 
The upper half of Panel B reports the returns and t-statistics from this strategy. The sample period 
return is 36 monthly basis points, which is not statistically significant (t = 1.25; p = 0.21).
Examining sub-periods, a positive abnormal return occurs in each sub-period; however, the return 
is only statistically significant during the 2004 to 2006 time period (at the 10 percent level). We 
also note that the returns in each sub-period are less than the returns available to investors by 
trading only on the level of short interest (see Table 6, Panel B). 

The second strategy is to trade when short sellers and analysts strongly conflict about a 
company's prospects. The strategy is to follow the short sellers by buying firms in the portfolio 
with the least favorable recommendations but the lowest short interest level, while selling short 
firms with the best recommendations but the highest short interest. The combined return reported 
in Panel B is 111 monthly basis points (40 + 71 from the corner cells in Panel A), which is both
sizable and statistically significant (t = 4.09; p < 0.01). Positive returns accrue on both the long
and short side, and the combined return is about double the return of 54 basis points available from
trading only on the level of short interest (see Table 5, Panel B). This trading strategy also
produces statistically significant returns of 110 basis points in 1994-1998, 130 basis points in
1999-2003, and 71 basis points in 2004-2006, providing evidence of the stability of this strategy
over time. Note that the highest return occurs in the 1999-2003 time period when analyst
recommendations are most misleading and produce a negative return of -48 monthly basis points (Table
6, Panel A).

In sum, we find that combining investment signals from analysts and short sellers yields 
incrementally greater future returns than trading strategies that use only one of the signals. Con- 
sistent with the results presented in Table 4, the most profitable investment strategy is when an 
investor trades in firms about which analysts and short sellers strongly disagree, and the investor 
takes the buy or sell position indicated by the short interest signal. 


Robustness Tests 


In this section, we report the results of several robustness tests (all untabulated). We begin by 
examining the sensitivity of the results to using the I/B/E/S provided consensus recommendation, 
rather than our self-constructed consensus. We find that our sample size increases slightly (to 
86,592 firm-quarter observations) with all results qualitatively the same as those reported. 

Next, we use an alternative short interest variable. Recall that for our main analyses, we use
the short interest ratio (open short interest divided by shares outstanding). As an alternative
deflator, we scale open short interest by the previous month's trading volume and label this
variable SIVOL. We find that SIVOL is highly correlated with the short interest ratio (ρ = 0.74).
When we regress the quintile assignment of SIVOL on the 11 predictive variables, we again find 
that all 11 variables are statistically significant in the expected direction, which is consistent with 
the results using the short interest ratio. 

Our remaining robustness tests reexamine the profitability of the trading strategies that use
information from both analysts and short sellers. First, we examine whether our results are robust
to monthly rebalancing of the portfolio and using a holding period of one month. Consistent with
our main results, we find that the most profitable strategy is to follow the short sellers when the
signals from analysts and short sellers strongly conflict (100 monthly basis points; t = 3.39; p <
0.01). The alternative strategy, which trades when analyst and short seller signals strongly concur,
remains less profitable although it improves from 36 monthly basis points (t = 1.25; p = 0.21) to
75 basis points (t = 2.14; p < 0.01).

Second, we examine whether the returns to our hedge portfolios become more profitable when
we use a finer partition to form portfolios. Each month, we sort firms into deciles (instead of
quintiles) based on the consensus analyst recommendation and/or the level of short interest. We
intersect the recommendation and short interest deciles to produce 100 portfolios and the strategies
take positions in the four most extreme portfolios using a six-month holding period. We find that
the more refined stock selection yields greater hedge portfolio returns (140 monthly basis points;
t = 3.35; p < 0.01).

Finally, we estimate the returns to our trading strategies using calendar-time portfolios, as an
alternative method for estimating abnormal returns. Specifically, we assess the profitability of the
strategies by estimating alphas from a regression of each portfolio's time-series of excess returns
on the Fama-French risk factors and momentum (Fama and French 1993; Carhart 1997). We find
that all hedge portfolio returns maintain statistical significance, which is consistent with the main
results that calculate abnormal returns using characteristic portfolio matching (Daniel et al. 1997).
We again find that the most profitable strategy is to follow the short sellers when their positions
conflict with analyst recommendations.
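
The calendar-time alpha can be illustrated with a small ordinary least squares sketch. This is a hypothetical example and not the authors' estimation code: the factor values are invented, the excess returns are constructed to have an intercept of exactly 50 basis points per month, and the regression is solved with plain Gauss-Jordan elimination on the normal equations so that the example needs no external libraries. The intercept of the regression of portfolio excess returns on the market, size, value, and momentum factors is the alpha.

```python
def ols(y, X):
    """OLS coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend a constant for the intercept (alpha)
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical monthly factor returns: market excess return, SMB, HML, momentum.
mkt = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02, 0.00, 0.01]
hml = [0.00, 0.01, 0.01, -0.01, 0.02, 0.00, -0.02, 0.01]
umd = [0.01, 0.02, -0.01, 0.00, 0.01, 0.01, -0.01, -0.02]
factors = list(zip(mkt, smb, hml, umd))

# Constructed portfolio excess returns with a true alpha of 0.005 per month.
excess = [0.005 + 1.0 * m + 0.5 * s - 0.3 * h + 0.2 * u for m, s, h, u in factors]

alpha, b_mkt, b_smb, b_hml, b_umd = ols(excess, factors)
```

In practice a statistics package would also report the standard error of the alpha; this sketch recovers only the point estimates.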

V. CONCLUSION

We contribute new findings on the characteristics of stocks favored by short sellers, about 
how those characteristics differ from those used by analysts in developing buy-hold-sell recom- 
mendations, and on the value relevance of short interest data to investors. By so doing, we expand 
upon the results of Dechow et al. (2001) and other studies that examine information used by short 
sellers.²¹ First, we find that analysts and short sellers use publicly available information
differently. Analysts over-recommend stocks with high growth, high accruals, and low book-to-market
ratios, even though prior research shows these characteristics are negatively related to future 
returns. In contrast, short sellers incorporate into their investment decisions the future return 
implications of all 11 accounting and market variables considered in this study. Second, we find 
that short interest provides information about future returns beyond that provided in the 11 items 
of information that prior research shows to be predictive of future returns. Analysts' recommendations
also provide incremental information, but a negative coefficient suggests trading against
the analysts. Third, based on these results, we show that a highly profitable trading strategy is one 
where investors trade with the short sellers when the short interest signal strongly conflicts with 
the consensus analyst recommendation. In fact, the value of short interest in choosing stocks to 
buy or sell is greater when conditioned on a conflicting consensus recommendation than when 
used by itself to trade stocks. 

Our study contributes to the stream of academic literature documenting market inefficiencies. 
The debate over market efficiency has shifted from simple yes-or-no questions to issues such as 
the types of information incorporated into prices with a delay, the speed of price adjustment, and 
factors that facilitate or impede price discovery. We consider fundamental and other predictive 
information that prior research has shown to be incorporated into prices with a delay. We show 
that analysts' recommendations can impede price discovery of this information, but short selling 
facilitates price discovery. A frequent criticism of research documenting a delayed reaction to
information is that a relation between information and future returns found to exist in the past may 
not recur in the future. That is, while some stocks are always misvalued, the characteristics of 
those stocks change over time, reducing the effectiveness of any system that places fixed weights 
on information. This criticism is less likely to be true for our approach, which identifies misvalued 
stocks based on a (strong) difference of opinion between analysts and short sellers. The
characteristics of misvalued stocks can change over time, as long as short sellers and analysts disagree
about valuation and the current stock price under-weights the views of short sellers. The extent of 
disagreement is also likely to be greatest in periods of high return volatility, and this is the type of 
market environment in which investors want stock-picking guidance. Consistent with this conjec- 
ture, we find the highest returns during the 1999-2003 sub-period. 

Our study is timely in light of recent actions in the U.S. and other countries to further regulate 
short selling. For example, SEC Release No. 34-58591 (SEC 2008) requires institutional managers 
with at least $100 million under management to report detailed information about daily short sales 
in new Form SH. The rationale in Release No. 34-58591 is that "sudden and unexplained declines 
in the prices of securities ... can give rise to questions about the underlying financial condition of 
an issuer, which in turn can create a crisis of confidence without a fundamental underlying 
basis."²² While our evidence does not include the specific time period referred to in this quote, any
new regulations are likely to extend to more normal periods of market activity. An important
implication of our study is that regulations that restrict or increase the cost of short selling run the
risk of limiting a potentially important source of information for investors about future equity
values. In this regard, the SEC has recently taken actions to increase the public availability of short
interest positions. On March 6, 2007, the SEC approved rule changes that increase the frequency
of short interest reporting from monthly to twice a month, effective September 2007. More timely
reporting of short interest data to the public should further increase the role of short sellers as an
information intermediary.


21 Two concurrent working papers consider aspects of short seller information use (Cao et al. 2007; Seybert and Wang
2009).

22 In SEC Release No. 58724 (October 2, 2008), the SEC states the daily short selling information reported on Form SH
will not be publicly available, in part, because it could give rise to imitative short selling.




APPENDIX 
QUANTITATIVE INVESTMENT SIGNALS

The last month of each calendar quarter is labeled quarter t. On this date, we measure our
stock recommendation and short interest variables. Relative to this date, we label as quarter q the
most recent fiscal quarter for which an earnings announcement is made at least two months prior
to the end of quarter t and no more than four quarters prior to the end of quarter t.

Incentive Compensation and Promotion-Based
Incentives of Mid-Level Managers: Evidence 
from a Multinational Corporation 

ABSTRACT: This study re-examines the hypothesis that explicit, compensation-based
incentives of mid-level managers are adjusted to the level of implicit incentives provided 
by the possibility of moving to higher-level positions. Using compensation data from a 
large multinational corporation, I find that, after controlling for the position’s scope and
level of accountability, bonus-based incentives are stronger for managers who (1) have 
fewer organizational levels left to climb, (2) face weaker implicit incentives from getting 
promoted to the next level, and (3) face weaker implicit incentives from getting pro- 
moted to the top of the organization. The findings are consistent with the notion that 
implicit incentives are taken into consideration in the design of explicit incentive con- 
tracts. In particular, the results support the prediction that explicit incentives are opti- 
mally stronger in situations with weaker implicit incentives. 


Keywords: implicit incentives; incentive compensation; promotions; career concerns. 

I. INTRODUCTION

The theoretical literature has argued that managers who face weaker promotion-based
implicit incentives should optimally receive stronger explicit variable-pay-based incentives
(Gibbons and Murphy 1992; Gibbs 1995). Despite well-developed theoretical arguments,
empirical studies have had limited success providing evidence that implicit, promotion-based
incentives are taken into consideration in the design of explicit incentive compensation contracts.¹
The present study revisits this hypothesis.

I analyze a sample of mid-level managers who can be directly compared with respect to their
positions, but who occupy different positions in their respective hierarchies, and who face different
promotion possibilities and rewards upon getting promoted. Thus, the setting provides an
opportunity to observe variation in the strength of the implicit, promotion-based incentives while
controlling for many confounding factors. In particular, the analyses in this study are based on
compensation data from a large multinational corporation that operates in over 100 countries
around the world. The company is organized around five main divisions, which in turn are
comprised of 30 subdivisions. Each of the divisions and subdivisions is represented in many different
countries. This matrix-like organizational structure allows me to directly compare positions across
countries.

I find that the explicit incentives provided by the company's bonus plan are stronger for 
managers who are positioned at higher organizational levels, face weaker implicit incentives from 
getting promoted to the next level, and face weaker implicit incentives from getting promoted to 
the top of the organization, after controlling for the position's scope and level of accountability. 
These findings are consistent with the theoretical argument that implicit incentives should be taken 
into consideration in designing explicit incentive contracts. More precisely, the evidence presented 
here supports the prediction that explicit incentives are optimally stronger in situations where 
implicit incentives are weaker (Gibbons and Murphy 1992; Gibbs 1995).

This study makes several contributions to the incentive compensation literature. First, I extend 
prior empirical work in accounting that examines the incentive intensity of mid-level managers
(Baiman et al. 1995; Nagar 2002) to consider the role of implicit promotion-based incentives,
thereby following the call in Bushman and Smith (2001) for more research on the interactions
between incentive contracting and other organizational features. In particular, the present study
documents that the intensity of explicit incentives is higher in situations that pose weaker implicit, 
promotion-based incentives. 

Second, this study contributes to prior studies that look at the interplay between explicit
incentive contracts and implicit incentives arising from the possibility of career advancement (e.g.,
Kahn and Sherer 1990; Gibbons and Murphy 1992). Those studies primarily focus on the length
of the manager's career horizon as the measure of the strength of implicit incentives. This study
adds to that literature by analyzing an innovative setting that provides the opportunity to isolate
the strength of implicit, promotion-based incentives. The analyses in this study provide additional
empirical evidence on the importance of career-based incentives in contract design.

Arguably most closely related to this study, Gibbs (1995) analyzes the compensation-based 
incentives of employees who have been passed over for promotion. The: author does not find 
significant differences between the explicit incentives of employees who Науе been passed over 
for promotion and those of employees who have not been passed over for promotion. Gibbs (1995) 
hypothesizes that the lack of evidence could be attributable to a centrally administered incentive 
scheme, which may not allow for variation at the employee level. In zontrast to the setting 
analyzed by Gibbs (1995), the empirical setting in this study provides the following advantages. 
First, the data set used in this study includes explicit information on the parameters of the incen- 
tive contracts, such as the expected bonuses, which allows for a more precise measurement of the 
strength of the bonus-based incentives. Second, although Gibbs (1995) and the present study both 
analyze data from a single company, in my empirical setting, the explicit incentives are determined 
by the individual country organizations. Thus, my analysis circumvents the issue of a centrally
administered incentive plan. 

The ability to generalize the results of this study is limited by the analysis of a single firm. 
However, the research site chosen for this study offers several advantages for empirical investigation
of the role of implicit, promotion-based incentives in incentive contracts provided to mid-level
managers. First, the organizational structure of the company, combined with the company’s
job-rating project, allows for a direct comparison of managers in different countries. Second, the
compensation practices in the different countries reflect the local labor markets and, thus, are not 
specific to the company that is studied. Finally, the incentives provided to the managers through 
the company's bonus plan are not based on company-wide guidelines, but are decided by the 
individual country organizations. 

The next section develops the hypotheses. Section III describes the research site, the sample 
and the measures used in the empirical analyses. Section IV discusses the research design and the 
empirical results. Section V provides a summary and conclusion. 


II. HYPOTHESIS DEVELOPMENT

This study analyzes the incentives of mid-level managers who work in a corporate hierarchy. 
In particular, I investigate whether explicit incentives that are provided by variable-pay-based 
schemes are adjusted, based on the level of implicit incentives that are provided by the possibility
of moving to higher-level positions in the organization.²

The argument that explicit, variable-pay-based incentives are optimally stronger in situations 
where implicit, promotion-based incentives are weaker has been formalized in career concerns 
models that allow for the presence of explicit incentive contracts (Gibbons and Murphy 1992; 
Gibbs 1995). 

In particular, Gibbons and Murphy (1992) investigate optimal (explicit) incentive contracts 
when the agent faces implicit incentives from the possibility of career advancement in a competi- 
tive external labor market. The analysis in Gibbons and Murphy (1992) is based on a multiperiod 
model with a single performance measure, which is a function of the agent’s innate ability and the 
agent’s effort that is provided during the period. The performance metric is used in the explicit 
incentive contract and is also used by the external labor market to update beliefs about the agent’s 
ability. The compensation that is offered to the agent in the second period in the competitive labor 
market is increasing in the market’s assessment of his/her ability. Implicit incentives arise in this 
setting because ability and effort cannot be fully separated and the agent has the incentive to 
increase effort to influence the labor market’s beliefs about his/her ability.³ The implicit incentives
provided by career concerns are stronger when future compensation is more valuable to the agent 
as is the case when the agent is further away from retirement. Given this setup, the analysis shows 
that the optimal explicit incentives provided by the agent’s compensation scheme are decreasing in 
the implicit incentives provided by career concerns. 
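The mechanism can be illustrated with a stylized two-period version of such a model. The notation below (output y_t, ability η, effort a_t, noise ε_t, fixed pay c_t, bonus rate b_t, discount factor δ, effort cost g) is an assumption of this sketch and is not taken from the models cited above.

```latex
% Stylized two-period career-concerns sketch (all notation assumed for illustration)
y_t = \eta + a_t + \varepsilon_t, \qquad w_t = c_t + b_t\, y_t, \qquad t = 1, 2
% Prior: \eta \sim N(m_0, \sigma_\eta^2); noise: \varepsilon_t \sim N(0, \sigma_\varepsilon^2).
% A competitive labor market sets c_2 = (1 - b_2)\, E[y_2 \mid y_1],
% and E[\eta \mid y_1] moves with y_1 at rate \sigma_\eta^2 / (\sigma_\eta^2 + \sigma_\varepsilon^2).
% First-period effort therefore satisfies
g'(a_1) = \underbrace{b_1}_{\text{explicit}}
        + \underbrace{\delta\,(1 - b_2)\,
          \frac{\sigma_\eta^2}{\sigma_\eta^2 + \sigma_\varepsilon^2}}_{\text{implicit (career concerns)}}
```

In this sketch the agent responds to the sum of the two terms, so for any given level of total effort incentives, a larger implicit term leaves less to be supplied by the explicit bonus rate b_1.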

Although Gibbons and Murphy (1992) examine implicit incentives that arise in a competitive 
external labor market, their analysis can also be interpreted in the light of an internal labor market,
as the authors discuss on pages 469–470. In particular, career concerns also arise in an
internal labor market if the employee's supervisor cannot perfectly distinguish between the em- 
ployee's ability and his/her effort. The authors argue that in an internal labor market setting 
explicit incentives should be strongest for workers with weak promotion-based incentives such as 
workers at the top of the corporate hierarchy. 

Consistent with the result in Gibbons and Murphy (1992), Gibbs (1995), using a single-period 
model, shows that the explicit incentives provided by a compensation scheme are optimally 
stronger when the implicit incentives provided by the possibility of being promoted are weaker. In 
Gibbs’ (1995) model, the promotion decision depends on the outcome of the performance mea- 
sure, which again provides incentives for the agent to increase his/her effort. 


2 Promotion-based incentives are one source of implicit incentives. Other forms of implicit incentives include incentives
that are based on non-contractible performance measures (see, e.g., Ederhof 2010). 

3 In equilibrium, the market's conjecture about the worker's ability is correct but the agent will exert higher effort than in 
the absence of career concerns because the market discounts his/her effort. 
An important feature of the models in Gibbons and Murphy (1992) and Gibbs (1995) is that
the strength of the implicit incentives is not a choice variable for the principal. In other words, 
both models characterize how the optimal explicit incentive contract should be designed for a 
given level of implicit incentives. This is in contrast to Lazear and Rosen (1981) and Rosen (1986) 
who analyze how the principal should optimally choose the compensation structure across hierar- 
chical levels in order to optimize the resulting implicit incentives. It seems reasonable to assume 
that, in my empirical setting, the implicit incentives provided by the possibility of promotion are 
determined exogenously with respect to the manager’s compensation scheme. The company in- 
vestigated in this study is organized along five main divisions with operations in many different 
countries. The purpose of the local units of a division in the different countries is to implement the 
global strategy of the division in the individual countries. The organizational structure of the local 
units is largely standardized. For example, all worldwide local units of a division are organized 
around a local unit manager whose authority and responsibilities follow worldwide guidelines. 
Thus, it seems unlikely that the company adapts its organizational form in a given country to the 
local compensation structure. In other words, it seems unlikely that the company structures its
organizational form around the local labor market in order to optimize the implicit incentives 
resulting from the possibility of career advancement. Moreover, the compensation paid to the local 
managers is dictated by the local labor market conditions (Gibbs 1995). 

Broadly speaking, the result of the analyses in Gibbons and Murphy (1992) and Gibbs (1995) 
is that optimal explicit incentives are decreasing in the strength of the implicit incentives provided 
by the possibility of career advancement. With respect to the setting of a corporate hierarchy, the 
strength of the implicit incentives is determined by the extent to which additional effort changes 
the probability of getting promoted and the "prize" that the manager is awarded upon promotion. 
The prize of getting promoted, in turn, is comprised of the immediate increase in compensation
and the option value of being eligible for future rewards deriving from further promotions (Rosen 
1986; Gibbs 1995). 

An important determinant of the strength of the implicit incentives that a manager faces is 
his/her hierarchical position, because managers who are closer to the top of their organization have 
"truncated" promotion paths. Thus, high-rank managers are expected to have stronger explicit 
incentives. However, managers who are at higher organizational levels are also likely to have stronger
explicit incentives because of their job characteristics. In particular, managers at higher hierarchical
levels are likely to have higher marginal productivities with respect to their effort (Baker and Hall 
2004). Moreover, managers at high organizational ranks are likely to have more decision-making 
authority and to have a larger span of control (Prendergast 2002; Nagar 2002; Wulf 2007). In order 
to isolate the strength of implicit incentives, I use various control variables to capture variation in
such job characteristics in the empirical analysis. The hypothesis can be stated as follows (ex- 
pressed in the alternative form): 


Hypothesis: The explicit incentives provided by the variable-pay scheme are decreasing in 
the strength of the promotion-based implicit incentives that the manager faces, 
ceteris paribus. 


III. RESEARCH SETTING AND MEASURES

Research Site 
I test the hypothesis using data from a large multinational engineering corporation that operates
in approximately 100 countries. The company primarily sells technology to utility and industry
customers. The company's operations are organized into five main divisions, which in turn are
comprised of 30 subdivisions.⁴ The divisions are the central building blocks of the organization;
they run the business lines from R&D to sales and they have primary P&L responsibility. For each 
of the divisions and subdivisions, a manager is responsible for the unit's worldwide operations. 
For each country in which a unit operates, a local manager is responsible for the unit's local 
operations. In addition, the countries have some infrastructure in the form of a country manage- 
ment team and support functions such as human resources, finance, legal, and communication. 

I test the hypothesis by comparing the incentives of the local managers in the different 
countries. In particular, the sample is comprised of all local managers who are in positions that are 
part of their local organizational “ladders,” which go all the way to the top of the organizations. In 
other words, the sample includes managers who hold positions that make them eligible to even- 
tually rise to the top of their respective organization. The sample excludes positions like IT and 
legal, which constitute support functions at the company and for which the promotion possibilities 
of the positions’ holders are limited. 


The Company's Job-Rating Project

In 2005, with the help of a consulting firm, the company started a project of assigning 
numerical ratings to the top positions in the company. The company initiated the project for the 
following reasons. First, and important for this study, the company wanted to generate a picture of 
the organization's hierarchy that reflects managers’ promotion paths and that is independent of 
existing job titles.⁶ In interviews with the author, the company contact emphasized that employees
often have mental models of the company hierarchy that are based on job titles that do not correspond
to the true hierarchy of the organization. The company hopes that the job-rating project will
improve the promotion process, in that it will facilitate, for example, promoting people to jobs that
have a higher rating but the same title as the employee's current position.

Second, the company wanted to gain an understanding of how comparable positions are 
compensated in the different countries. In interviews with the author, the company contact em- 
phasized that, to that end, a key aspect of the job-rating project was that the ratings were not 
influenced by the current position holder's compensation. Moreover, in order to facilitate comparability
across countries, the ratings were assigned based on a standardized scheme, which the
consulting company developed based on an initial pilot study. 

The ratings are assigned to a position, and not to the person currently holding the position. 
Thus, the ratings are independent of the current manager's performance. The ratings are assigned 
based on a combination of factors primarily capturing the position's scope and level of account- 
ability. In particular, a position's scope is captured by the number of employees that a manager 
oversees and the revenue figure for which the manager is responsible. The level of accountability 
is largely captured by the decision rights held by the manager.

In interviews with the author, the company contact emphasized that promotions occur from 
one rating category to the next. In 2008, the company completed rating all positions that fall into 
the top ten rating categories. The position of the chief executive officer is assigned a rating of 1; 
an example of a job with a rating of 10 is a manager who is in charge of the operations of a local 
subdivision with revenue of $22 million and 40 employees. 


4 Subsequently, I collectively refer to divisions and subdivisions as "units."

5 The classification of positions into "organizational ladder" and "support functions" is based on extensive interviews with 
the company contact. 

6 The notion that the organization's hierarchy is based on promotion paths is consistent, for example, with the way the 
hierarchy in Baker et al. (1994) and Gibbs (1995) is defined. 
The Company’s Incentive System
Aside from a fixed salary, all managers in the sample are eligible for a bonus payment. The 
company has guidelines that pertain to all worldwide participants in the bonus plan. In particular, 
the performance measures used in the bonus plan, the weight that is placed on firm-wide versus 
divisional measures, and aspects of the pay-performance relation must follow the company’s 
guidelines. In contrast, and important for this study, the company does not have worldwide guide- 
lines with respect to participants’ expected bonuses—i.e., the compensation that is paid out in the 
form of a bonus when the performance meets expectations. In other words, the company does not 
have guidelines with respect to the percentage of a participant’s cash compensation that is vari- 
able. The level of the expected bonus, which is typically expressed as a percentage of base salary, 
is determined by the respective country management. Some countries interpret the bonus plan as 
a guaranteed 13th-month salary for all participants. These countries are excluded from the analy- 
sis. 
In addition to the bonus plan, the company awards stock options to the top employees of the 
organization who hold positions that are largely in the top seven rating categories. In contrast to 
the company’s bonus plan, all aspects of the stock option plan are centrally administered by the 
company’s headquarters and follow worldwide company guidelines. In awarding stock options in 
2008, the company made use of the newly available job ratings. In particular, the number of
options awarded to managers is fixed for a given job-rating category.


Sample and Measures 

The analyses in this study are based on a data set that contains information for 1,151 managers
in 14 countries who have been assigned ratings between 2 and 10. The data are largely for the
year 2008. Table 1 shows the distribution of the sample across different job-rating categories and 
countries. 

Table 2 and Figure 1 summarize compensation levels across organizational levels for the 
different countries. Specifically, the table and graph show how the median total cash compensation,
which is calculated by summing base salary and expected bonus, varies across job categories
in the different countries. 

The compensation figures are expressed relative to the median compensation levels at job-rating
category 10 in the respective country, which are normalized to 1.00.⁷ As one would expect,
for the most part, the values of the compensation ratios are increasing as managers move to 
higher-level job categories. Visual inspection of Figure 1 suggests that the overall pay structures 
can be characterized as convex, which is consistent with findings in prior studies and the predic- 
tion from tournament theory (e.g., Lambert et al. 1993; Rosen 1986). Table 2 and Figure 1 also 
indicate substantial variation in the pay structures across countries. For example, the ratio of 
median pay levels in category 5 to category 10 is approximately 2 in Germany, but more than 4 in 
the United Kingdom. 
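The normalization behind these ratios can be sketched as follows. This is a minimal illustration, not the study's code: the function name and the pay figures are hypothetical, and the actual compensation data are confidential.

```python
from statistics import median

def relative_median_pay(pay_by_category, base_category=10):
    """Median cash pay per job-rating category, divided by the base category's median."""
    medians = {cat: median(values) for cat, values in pay_by_category.items()}
    base = medians[base_category]
    return {cat: m / base for cat, m in medians.items()}

# Hypothetical cash pay (base salary + expected bonus) for managers in one country.
example_country = {
    10: [90, 100, 110],
    8: [120, 130, 160],
    5: [240, 250, 300],
}
ratios = relative_median_pay(example_country)
print(ratios)  # {10: 1.0, 8: 1.3, 5: 2.5}
```

Because each country is scaled by its own category-10 median, the ratios can be compared across countries without revealing currency amounts.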


Explicit Incentives 

Testing the hypothesis requires a measure of the explicit incentives provided by the compen- 
sation scheme. As discussed above, stock options are awarded uniformly across countries and the 
number of options awarded is fixed for a given job-rating category. Therefore, I focus attention on 
the company’s bonus plan. Conceptually, the strength of the incentives provided by a bonus plan 
is reflected in how much the agent’s compensation increases when s/he increases his/her effort. As 
discussed above, the company has worldwide guidelines with respect to certain parameters of the 


7 Confidentiality reasons preclude me from reporting dollar amounts. 
TABLE 2
Median Pay Levels across Job-Rating Categories for the Different Countries
(expressed relative to the median pay level for category 10 in the respective country)ᵃ

Job-Rating
Category    Sweden   Finland   Germany   Italy   Switzerland   Norway   Poland
2           5.93     —         —         —       —             —        —
3           —        5.69      3.26      —       —             —        —
4           3.81     —         —         2.77    3.40          —        —
5           2.80     —         2.05      2.42    2.34          2.50     3.76
6           1.98     2.45      1.64      2.80    2.06          1.66
7           1.65     1.65      1.46      2.29    1.80          1.69     —
8           1.25     1.31      1.15      1.43    1.29          1.40     1.69
9           1.08     1.23      1.04      1.27    1.07          1.12     1.24
10          1.00     1.00      1.00      1.00    1.00          1.00     1.00

Job-Rating                     United    United              New
Category    Russia   Spain     Kingdom   States    Denmark   Zealand   Turkey
2           —        —         —         —         —         —         —
3           —        —         —         —         —         —         —
4           —        —         —         —         —         —         —
5           3.81     2.89      4.35      3.16      —         —         —
6           3.28     2.06      3.24      1.83      3.38      2.88      4.39
7           1.94     1.70      1.77      1.63      1.16      —         4.31
8           1.60     1.48      1.54      1.38      1.10      1.66      27
9           1.38     1.15      1.08      1.11      0.79      1.24      1.57
10          1.00     1.00      1.00      1.00      1.00      1.00      1.00

ᵃ Pay is the sum of the manager's base salary and his/her expected bonus. Expatriates are excluded from the statistics.


bonus plan. In particular, the payoff function is linear and an increase in performance leads to a
fixed percentage increase in bonus payout. For example, the bonus payout for the maximum 
performance is twice as much as the bonus payout for the target performance for all plan partici- 
pants. Therefore, variation in the expected bonus reflects variation in the strength of the incentives 
provided by the bonus plan. Thus, I measure the strength of the explicit incentives provided by the 
bonus scheme using the ratio of the target bonus to base salary (TB) (Indjejikian and Nanda 2002). 
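A minimal sketch of this measure follows. The function names, the `progress` parameterization of the payout schedule (0 at target performance, 1 at maximum performance), and the salary figures are assumptions made for illustration; only the ratio itself and the "maximum pays twice the target" feature come from the description above.

```python
def target_bonus_ratio(target_bonus, base_salary):
    """TB: target bonus divided by base salary."""
    if base_salary <= 0:
        raise ValueError("base_salary must be positive")
    return target_bonus / base_salary

def bonus_payout(target_bonus, progress):
    """Payout under an assumed linear schedule between target and maximum performance.

    progress = 0.0 is target performance and 1.0 is maximum performance;
    the payout at the maximum is twice the payout at target.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return target_bonus * (1.0 + progress)

# Two hypothetical managers with the same base salary but different target bonuses.
low = target_bonus_ratio(target_bonus=14_000, base_salary=100_000)
high = target_bonus_ratio(target_bonus=39_000, base_salary=100_000)

# Pay gained by moving from target to maximum performance under the common schedule.
gain_low = bonus_payout(14_000, 1.0) - bonus_payout(14_000, 0.0)
gain_high = bonus_payout(39_000, 1.0) - bonus_payout(39_000, 0.0)
print(low, high, gain_low, gain_high)
```

Since the schedule is the same for every participant, the same improvement in performance moves more pay for the manager with the higher target bonus, which is why the ratio can serve as a measure of incentive strength.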


Implicit Incentives 

The hypothesis predicts that managers who face stronger promotion-based implicit incentives 
have lower variable-pay-based explicit incentives. As discussed above, the strength of the implicit 
incentives is determined by the extent to which additional effort changes the probability of getting
promoted and the “prize” that the manager is awarded upon promotion. The prize of getting
promoted, in turn, is comprised of the immediate increase in compensation and the option value of
being eligible for future rewards deriving from further promotions (Rosen 1986; Gibbs 1995). I
use different measures to capture the strength of the implicit incentives.

First, as argued above, managers at higher hierarchical positions in their respective organiza- 
tions have truncated promotion paths and are thus expected to have lower implicit incentives. In 
order to capture a manager’s hierarchical position in his/her country organization, I include the job 
FIGURE 1
(Median Cash Pay at Respective Level/Median Cash Pay at Level 10)ᵃ

[Line graph. Horizontal axis: job rating categories. One line per country: Sweden, Finland, Germany, Italy, Switzerland, Norway, Poland, Russia, Spain, UK, US, Denmark, New Zealand, Turkey.]

ᵃ Expatriates are excluded from the statistics.


rating of the highest-ranking manager in the respective country, HIGHESTPOSITION, in the 
analysis. 

As described above, the strength of the promotion-based implicit incentives is a function of 
the prize that the manager receives upon promotion and the extent to which additional effort 
changes the probability of getting promoted. I develop two additional measures to capture those 
features. 

It is unobservable how additional effort changes the probability of getting promoted. How- 
ever, Gibbs (1995, 1996) has shown that the derivative of the probability of getting promoted with 
respect to the agent's effort is increasing in the promotion probability as long as the promotion 
probability is below one-half (also see Campbell 2008). Interviews with the company contact 
confirmed that it can reasonably be assumed that, for the managers in my sample, the implicit 
incentives are increasing in the promotion probabilities, as the promotion rates at the company are 
sufficiently low. 
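The relation between the promotion probability and its sensitivity to effort can be illustrated numerically. The sketch below assumes, purely for illustration, that a manager is promoted when effort plus standard normal noise exceeds a cutoff; this setup and the function name are assumptions of the example, not the model in the cited work.

```python
from statistics import NormalDist

def marginal_promotion_probability(promotion_probability):
    """Slope of P(promotion) in effort when promotion requires effort + noise > cutoff.

    With standard normal noise, P = Phi(effort - cutoff), so dP/d(effort) equals the
    normal density evaluated at the point where the distribution function equals P.
    """
    if not 0.0 < promotion_probability < 1.0:
        raise ValueError("promotion_probability must lie strictly between 0 and 1")
    noise = NormalDist()  # standard normal noise: an assumption of this sketch
    return noise.pdf(noise.inv_cdf(promotion_probability))

for p in (0.05, 0.10, 0.25, 0.50):
    print(f"P = {p:.2f}  dP/d(effort) = {marginal_promotion_probability(p):.3f}")
```

Under this assumption the slope rises with the promotion probability up to one-half and falls symmetrically beyond it, which is the pattern that motivates using promotion probabilities as a proxy when promotion rates are low.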

Data limitations prevent the direct calculation of the promotion probabilities in the individual 
job categories. However, the company contact emphasized that dismissals and demotions are fairly 
rare in this company (also see Gibbs 1995). Thus, I employ the median tenure at the individual job 
levels as a proxy for promotion probabilities. More precisely, I use the inverse of the median 
tenure in the manager's current job-rating category to capture promotion possibilities. 
Focusing on the manager’s immediate promotion possibility, I construct a measure that takes
the compensation differential between the manager's current job and the next-higher job-rating
category, and the manager's chance of being promoted to that next level into consideration.
Specifically, I compute the ratio of the median expected cash compensation for the next-higher
job-rating category to the manager's expected cash compensation (in his/her current job), both for the
manager's respective country. I then multiply that ratio by the inverse of the median tenure in the 
manager's current job-rating category in the respective country. The measure is denoted by IM- 
PLICITNEXT. 
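The construction just described can be sketched in a few lines. The function name, argument names, and all numbers are hypothetical, and measuring tenure in years is an assumption of the example.

```python
from statistics import median

def implicit_next(own_expected_pay, next_level_pay, current_level_tenure):
    """(median pay one category up / own expected pay) * (1 / median tenure in current category)."""
    pay_ratio = median(next_level_pay) / own_expected_pay
    promotion_proxy = 1.0 / median(current_level_tenure)
    return pay_ratio * promotion_proxy

# Hypothetical data for one country: a category-8 manager and the category-7 peers above.
value = implicit_next(
    own_expected_pay=100.0,
    next_level_pay=[120.0, 130.0, 150.0],   # expected cash pay in category 7
    current_level_tenure=[2.0, 4.0, 5.0],   # tenure in category 8 (years, assumed unit)
)
print(value)  # 0.325
```

Here the pay ratio is 1.3 and the promotion proxy is 1/4, so the measure equals 0.325; a larger pay step or a shorter median tenure both raise it.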

Similarly, consistent with Rosen (1986), I develop a measure that takes the median tenure and 
compensation level at the manager's current job and at each higher-level category in the respective 
organization into consideration. Specifically, I calculate the sum of the discounted compensation 
differentials between the manager's current job and the top of the respective organization. The 
compensation differential between two levels is discounted by the cumulative probability of being
promoted to the respective level, where the probability of promotion is again proxied by the
inverse of the median tenure. I denote this measure as IMPLICITTOP. 
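One way to read this construction is sketched below. It is an interpretation under stated assumptions rather than the study's code: the ladder is represented as lists ordered from the manager's current category up to the top of the country organization, the first step is measured against the manager's own pay (as in IMPLICITNEXT), and all names and numbers are hypothetical.

```python
def implicit_top(own_expected_pay, median_pay, median_tenure):
    """Sum of pay ratios up the ladder, each weighted by a cumulative promotion proxy.

    median_pay[k] is the median expected cash pay k rungs above the manager's current
    category (index 0 is the current category, the last index is the top).
    median_tenure[k] is the median tenure k rungs above the current category; the
    top rung needs no tenure because there is no promotion beyond it.
    """
    if len(median_tenure) != len(median_pay) - 1:
        raise ValueError("need one tenure value per rung below the top")
    total = 0.0
    cumulative_probability = 1.0
    for k, tenure in enumerate(median_tenure):
        # Proxy for the probability of having been promoted out of rungs 0..k.
        cumulative_probability *= 1.0 / tenure
        # The first step is measured against the manager's own pay, later steps
        # against the median pay of the rung being left.
        base_pay = own_expected_pay if k == 0 else median_pay[k]
        total += (median_pay[k + 1] / base_pay) * cumulative_probability
    return total

# Hypothetical three-rung ladder: current category, one category up, and the top.
example = implicit_top(
    own_expected_pay=100.0,
    median_pay=[100.0, 150.0, 300.0],
    median_tenure=[4.0, 5.0],
)
print(example)
```

In the example the first step contributes 1.5 × 1/4 and the second contributes 2.0 × (1/4 × 1/5), so steps further up the ladder carry large pay ratios but small weights.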


Control Variables 

Incentive compensation practices vary systematically across countries. As discussed in more 
detail in the research design section, I employ a measure that captures the general incentive 
intensity in the respective country. Specifically, I use a measure that is provided by the company's 
consulting company that captures the median incentive intensity for mid-level managers for a 
large number of industrial companies in the different countries.⁸

Prior literature has argued that division managers who have more decision-making authority 
receive more incentive-based pay because the potential for misuse is stronger if managers have 
more authority (Prendergast 2002; Nagar 2002; Wulf 2007). I use the job rating assigned to the
manager's position, CATEGORY, as a proxy for the manager's decision-making authority. As
described above, the ratings are assigned based on factors that primarily capture the positions' 
scope and level of accountability, an important determinant of which is the manager's level of 
decision-making authority. 

A standard result in agency theory is that managers who have a higher marginal productivity
with respect to effort should optimally have stronger explicit incentives (Baker and Hall 2004). In 
order to control for differences in managers’ marginal productivities, following Wulf (2007), I 
include a measure of the relative importance of the manager's unit in the analyses. In particular, I 
measure a unit's relative importance by the ratio of the sales of the unit that the manager is
affiliated with to the total sales in the respective country (RSALES).⁹

Theory suggests that more noise in the performance measures increases the risk that the 
manager is exposed to and the prediction is that incentives are lowered when the risk exposure is 
higher (Holmstrom 1979).¹⁰ As described above, the performance measures that are used in the
bonus plans follow company guidelines in the sense that, worldwide, managers in the same 
position have the same performance measures in their bonus plan. Thus, managers in similar 
positions in different countries could be exposed to different levels of noise because they operate 
in different environments. Due to data limitations, I rely on the measure capturing the general 
incentive intensity in the different countries in order to capture differences in noise levels.

I also include the expected growth in sales in the regression analysis. I measure expected 
growth in sales by the ratio of budgeted sales for 2008 to the actual sales number for 2007 





8 I do not report descriptive statistics on this measure due to confidentiality reasons.
9 Also see Baiman et al. (1995).
10 Also see Prendergast (2000) for arguments why the relationship between uncertainty and incentives could be positive.
(SALESGROWTH). This measure can be interpreted as a proxy for the firm's investment opportunities.
Prior literature has argued that firms with greater growth opportunities employ compensation
contracts with greater incentive intensity. Smith and Watts (1992) argue that the observability
of managers' actions decreases with the firm's growth opportunities. In contrast, the actions of
managers in low-growth firms are argued to be more observable because these actions are largely 
focused on the maintenance and supervision of existing assets (Gaver and Gaver 1993; Holthausen 
et al. 1995). 

Agency theory has also argued that the optimal incentive intensity depends on the level of 
monitoring (e.g., Jensen and Meckling 1976; Prendergast 2002; Liang et al. 2008). In particular, 
Jensen and Meckling (1976) argue that incentive contracting and monitoring are alternative solu- 
tions to the moral hazard problem. As discussed above, the company is organized around five main 
divisions, which in turn are comprised of 30 subdivisions. The individual country organizations 
are structured around a country management team. Moreover, the local divisions and subdivisions 
are led by local division and subdivision managers. It seems reasonable to expect that direct 
monitoring of the actions of managers who are at the top of an organization is more difficult than 
monitoring the actions of lower-level managers. In order to capture such differences, I include an 
indicator variable, TOPMANAGER, indicating whether a manager is a highest-ranking manager in 
his/her country. Moreover, it seems plausible that local division managers do not receive as much 
monitoring as managers who are at lower organizational levels. Specifically, local division man- 
agers lead the respective business lines in their country. Thus, they are the highest-ranking man- 
agers in their respective fields of expertise. In order to capture potential differences in the moni- 
toring of local division managers, I include an additional indicator variable, DIVMANAGER, in the 
tests. All measures are defined in Table 3. 
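Three of the control variables are simple enough to sketch directly. The function and argument names and the numbers below are hypothetical; DIVMANAGER is omitted because it depends on organizational information that cannot be derived from the other variables.

```python
def relative_sales(unit_sales, country_sales):
    """RSALES: sales of the manager's unit divided by total sales in the country."""
    return unit_sales / country_sales

def sales_growth(budgeted_sales_2008, actual_sales_2007):
    """SALESGROWTH: budgeted sales for 2008 divided by actual sales for 2007."""
    return budgeted_sales_2008 / actual_sales_2007

def is_top_manager(category, categories_in_country):
    """TOPMANAGER: 1 if the manager holds the highest-ranking category in the country.

    Lower numbers denote higher positions (the chief executive officer is rated 1).
    """
    return int(category == min(categories_in_country))

# Hypothetical unit in a country whose managers hold categories 4 through 10.
rsales = relative_sales(unit_sales=25.0, country_sales=200.0)
growth = sales_growth(budgeted_sales_2008=27.5, actual_sales_2007=25.0)
top_flag = is_top_manager(category=4, categories_in_country=[4, 6, 7, 8, 9, 10])
print(rsales, growth, top_flag)
```

A SALESGROWTH value above one indicates that budgeted sales exceed the prior year's actual sales.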


Descriptive Statistics 


Of the 1,151 managers in the sample, 18 are expatriates. In the company, expatriates are 
compensated based on the norms in their home country with certain adjustments, such as for 
hardship and allowance. Thus, expatriates are excluded from the analyses. With the exception of 
Italy, the expatriates hold lower-level positions, with ratings between 6 and 10. In Italy, the 
highest-ranking manager is an expatriate, which precludes calculation of the variable IMPLICIT- 
TOP for the observations from that country. 

Table 4 provides descriptive statistics for the variables used in the analysis, calculating such 
statistics for each job-rating category using the pooled sample across all countries. 

The mean (median) values for the dependent variable in the analyses, TB, which captures the 
ratio of (target bonus/salary), are 0.18 (0.14) for managers at level 10 and 0.39 (0.39) for managers 
who hold positions in job-rating category 3. Overall, the descriptive statistics indicate an increas- 
ing trend of TB as one moves to higher-level positions, which is consistent with the expectation 
that managers with more decision-making authority receive more incentive-based pay. However, 
the mean and median values for TB are fairly constant for job-rating categories 2 through 5. One 
issue that should be kept in mind is that the job-rating categories are not distributed evenly across 
countries and that there are country-specific differences in the level of incentive compensation. 
The finding that the mean and median values for TB are fairly constant for job-rating categories 2
through 5 could also be influenced by the fact that there is no one-to-one mapping between 
job-rating categories and hierarchical levels. Specifically, each of the categories 2 through 5 


11 Lambert and Larcker (1987) explain why the relationship between expected sales growth and the strength of the
incentives could also be negative. 
TABLE 3
Measures

Measure Reflecting the Strength of Explicit Incentives
TB               (target bonus_i / base salary_i); both for manager i.

Measures Reflecting the Strength of the Implicit Promotion-Based Incentives
HIGHESTPOSITION  job-rating category of the highest-ranking manager in country j.
IMPLICITNEXT     [(median expected cash pay_{j,t-1} / expected cash pay_i) * (1 / median tenure_{j,t})]:
                 the ratio of the median cash pay (base salary + expected bonus) for job-rating
                 category t-1 in country j to the manager’s expected cash pay in his/her current
                 position.
IMPLICITTOP      (median expected cash pay_{j,t-1} / expected cash pay_i) * (1 / median tenure_{j,t})
                 + \sum_{a=1}^{t-1} [(median expected cash pay_{j,a-1} / median expected cash pay_{j,a})
                 * \prod_{b=0}^{t-a} (1 / median tenure_{j,t-b})]: sum of the discounted compensation
                 differentials between the manager's current job-rating category t and the top of the
                 respective country organization j.

Control Variables
CATEGORY         job rating assigned to manager i's position.
RSALES           (sales of unit k / total sales of country j that unit k is located in).
SALESGROWTH      (budgeted sales for 2008_k / actual sales for 2007_k); both for unit k in country j.
TOPMANAGER       indicator variable that is equal to 1 if CATEGORY indicates that the manager in
                 job-rating category t is a highest-ranking manager in country j, 0 otherwise.
DIVMANAGER       indicator variable that is equal to 1 if manager i is in charge of a local division,
                 0 otherwise.

includes a high percentage of managers who are highest-ranking in their respective country organization. 


The descriptive statistics for IMPLICITNEXT suggest that the implicit incentives provided by 
the prospect of solely moving up one job category are largely increasing in the hierarchical level. 
Specifically, the mean (median) values for IMPLICITNEXT are 0.38 (0.37) for managers at level 
10 and 0.61 (0.63) for managers who hold positions in job-rating category 5. This trend is 
consistent with the convex pay structures documented in Table 2 and Figure 1. 
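
The definition of IMPLICITNEXT in Table 3 can be illustrated with a minimal sketch. It assumes that the measure multiplies the pay ratio by the inverse of the median tenure in the manager's current job-rating category; the function name, argument names, and all input values are hypothetical and are not taken from the company's data.

```python
from statistics import median


def implicit_next(own_expected_cash_pay, next_category_cash_pay, current_category_tenure):
    """Sketch of IMPLICITNEXT under the stated assumptions.

    own_expected_cash_pay: the manager's expected cash pay (base salary + expected bonus).
    next_category_cash_pay: expected cash pay of managers one category up in the same country.
    current_category_tenure: tenure (in years) of managers in the current category and country.
    """
    pay_ratio = median(next_category_cash_pay) / own_expected_cash_pay
    promotion_proxy = 1.0 / median(current_category_tenure)
    return pay_ratio * promotion_proxy


# Hypothetical inputs: median pay one level up is 150,000; median tenure is 4 years.
value = implicit_next(100_000, [120_000, 150_000, 180_000], [3.0, 4.0, 5.0])
print(round(value, 3))  # 0.375
```

In this sketch a shorter median tenure or a larger pay differential both raise the measure, which matches the direction of the comparative statics discussed in the text.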
The variable IMPLICITTOP captures the implicit incentives provided by the possibility of 
moving to the top of the respective organization. Although the descriptive statistics largely indicate 
an upward trend with respect to the mean levels as one moves up the organizational 
hierarchy, they do not suggest a strong trend with respect to the median values. Specifically, with 
the exception of the figure for job-rating category 6, the median values of IMPLICITTOP are 
around 0.60 to 0.65 across categories. This finding is consistent with the notion in tournament 
theory that convex pay structures result in constant implicit incentives throughout the hierarchy 
(Rosen 1986). Broadly speaking, the intuition is as follows. On the one hand, managers who are 
further down in the hierarchy face a higher number of organizational levels that they could 
potentially climb, which increases the number of terms that comprise the option value, which 

TABLE 4 
Descriptive Statisticsᵃ 

Variable            Mean    Std.    Q1      Median
CATEGORY = 2
TB                  0.38    NA      0.38    0.38
IMPLICITNEXT        NA      NA      NA      NA
IMPLICITTOP         NA      NA      NA      NA
TOPMANAGER          1.00    NA      1.00    1.00
DIVMANAGER          1.00    NA      1.00    1.00
RSALES              0.92    NA      0.92    0.92
SALESGROWTH         1.19    NA      1.19    1.19
CATEGORY = 3
TB                  0.39    0.02    0.38    0.39
IMPLICITNEXT        NA      NA      NA      NA
IMPLICITTOP         NA      NA      NA      NA
TOPMANAGER          1.00    NA      1.00    1.00
DIVMANAGER          1.00    0.00    1.00    1.00
RSALES              0.93    0.00    0.92    0.93
SALESGROWTH         1.15    0.01    1.14    1.15
CATEGORY = 4
TB                  0.38    0.14    0.25    0.38
IMPLICITNEXT        NA      NA      NA      NA
IMPLICITTOP         NA      NA      NA      NA
TOPMANAGER          0.33    0.58    0.00    0.00
DIVMANAGER          1.00    0.00    1.00    1.00
RSALES              0.40    0.44    0.13    0.16
SALESGROWTH         1.27    0.23    1.12    1.15
CATEGORY = 5
TB                  0.38    0.10    0.34    0.38
IMPLICITNEXT        0.61    0.12    0.48    0.63
IMPLICITTOP         0.61    0.12    0.48    0.63
TOPMANAGER          0.48    0.51    0.00    0.00
DIVMANAGER          0.22    0.42    0.00    0.00
RSALES              0.41    0.40    0.10    0.24
SALESGROWTH         1.15    0.18    1.08    1.10
CATEGORY = 6
TB                  0.36    0.25    0.23    0.30
IMPLICITNEXT        0.83    1.14    0.24    0.29
IMPLICITTOP         0.94    1.08    0.40    0.46
TOPMANAGER          0.05    0.23    0.00    0.00
DIVMANAGER          0.11    0.31    0.00    0.00
RSALES              0.36    0.37    0.10    0.16
SALESGROWTH         1.13    0.14    1.07    1.12
CATEGORY = 7
TB                  0.31    0.23    0.17    0.25
IMPLICITNEXT        0.73    0.55    0.33    0.60
IMPLICITTOP         1.08    1.77    0.44    0.60
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1.19 


0.40 


0.67 
1.17 


0.40 
0.86 
0.86 


(continued on next page) 
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TABLE 4 (continued) 





Variable            N       Mean    Std.    Q1      Median  Q3
CATEGORY = 7 (continued)
TOPMANAGER          85      0.00    0.00    0.00    0.00    0.00
DIVMANAGER          85      0.05    0.21    0.00    0.00    0.00
RSALES              74      0.20    0.28    0.03    0.10    0.21
SALESGROWTH         74      1.26    0.86    1.06    1.12    1.19
CATEGORY = 8
TB                  203     0.26    0.20    0.15    0.21    0.29
IMPLICITNEXT        54      0.43    0.18    0.30    0.45    0.56
IMPLICITTOP         54      0.71    0.23    0.53    0.64    0.93
TOPMANAGER          208     0.00    0.00    0.00    0.00    0.00
DIVMANAGER          208     0.04    0.20    0.00    0.00    0.00
RSALES              174     0.25    0.33    0.04    0.12    0.26
SALESGROWTH         174     1.14    0.18    1.06    1.11    1.17
CATEGORY = 9
TB                  323     0.20    0.12    0.11    0.16    0.23
IMPLICITNEXT        105     0.41    0.15    0.28    0.41    0.49
IMPLICITTOP         105     0.64    0.15    0.53    0.65    0.71
TOPMANAGER          331     0.00    0.00    0.00    0.00    0.00
DIVMANAGER          331     0.01    0.09    0.00    0.00    0.00
RSALES              279     0.19    0.26    0.04    0.10    0.20
SALESGROWTH         279     1.13    0.19    1.04    1.12    1.18
CATEGORY = 10
TB                  404     0.18    0.09    0.11    0.14    0.20
IMPLICITNEXT        179     0.38    0.08    0.33    0.37    0.41
IMPLICITTOP         179     0.58    0.09    0.55    0.58    0.62
TOPMANAGER          425     0.00    0.00    0.00    0.00    0.00
DIVMANAGER          425     0.00    0.07    0.00    0.00    0.00
RSALES              334     0.26    0.34    0.03    0.12    0.26
SALESGROWTH         334     1.16    0.72    1.07    1.12    1.19


ᵃ See Table 3 for variable definitions. The descriptive statistics are calculated for the pooled sample across countries, 
excluding 18 expatriate observations. 


determines the implicit incentives. On the other hand, managers who are further down in the 
organization face lower compensation increases by moving up the initial levels, which are 
weighted most heavily in the computation of the option value. 

The variable TOPMANAGER indicates whether a manager is among the highest-ranking 
managers in the respective country organization. Table 1 shows that the highest-level managers are 
concentrated in job-rating categories 2 through 6. Similarly, the descriptive statistics for DIVMAN- 
AGER indicate that the proportion of managers who are in charge of a local division is higher in 
the higher-level job-rating categories. 

RSALES captures the relative importance of the division or subdivision with which a manager 
is affiliated, measured by the ratio of the unit's sales divided by the total sales of the respective 
country. As one would expect, the summary statistics for RSALES largely indicate that managers 
in lower job-rating categories are affiliated with smaller units. The median values for RSALES are 
around 0.11 for categories 7 through 10, around 0.20 for categories 4 through 6, and over 0.90 for 
categories 2 and 3. 

The median values for SALESGROWTH, which is measured by the ratio of budgeted sales 
for 2008 to actual sales for 2007 for the respective unit, largely indicate an upward trend as 
managers move to higher-level job categories. Managers in categories 2 through 4 are in charge of 
units that experience growth between 15 percent and 20 percent; in contrast, the expected unit 
growth for managers in categories 5 through 10 is around 11 percent. This finding is consistent 
with the notion that units that face higher growth opportunities are managed by higher-level 
managers who are delegated more decision-making authority. 
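
The two unit-level controls described above are simple ratios. The following sketch shows how they could be computed, together with the log-transformation of RSALES that is used in the regressions; the function names and the sales figures are hypothetical.

```python
import math


def rsales(unit_sales, country_total_sales):
    """Share of the country's total sales that is contributed by the manager's unit."""
    return unit_sales / country_total_sales


def salesgrowth(budgeted_sales_2008, actual_sales_2007):
    """Budgeted 2008 sales relative to actual 2007 sales (1.12 means 12 percent expected growth)."""
    return budgeted_sales_2008 / actual_sales_2007


# Hypothetical unit: sales of 12 in a country with total sales of 100,
# and budgeted 2008 sales of 56 against actual 2007 sales of 50.
share = rsales(12.0, 100.0)
growth = salesgrowth(56.0, 50.0)
log_share = math.log(share)  # natural logarithm, as in Log(RSALES)

print(round(share, 2), round(growth, 2))  # 0.12 1.12
```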


IV. ANALYSES AND RESULTS 
Strength of Implicit Incentives 


The aim of this study is to analyze whether explicit incentives that are provided by variable- 
pay-based schemes are adjusted, based on the level of implicit incentives that are provided by the 
possibility of moving to a higher-level position in the hierarchy. The overall strategy that I use to 
address this question is to compare the explicit incentives of the local managers in the different 
countries. Given the organizational structure of the company, managers in the same job-rating 
category can reasonably be assumed to hold fairly comparable positions across countries. How- 
ever, there is likely to be substantial variation in the strength of the implicit incentives that 
managers in different countries face due to the following reasons. 

First, as can be observed from Table 1, the hierarchical structures differ across countries. 
Specifically, the countries vary with respect to the job category of the highest-ranking manager. In 
interviews with the author, the company contact emphasized that these differences are largely 
attributable to differences in the sizes of the units in the individual countries. Managers who have 
more organizational levels left “to climb" have stronger implicit incentives, all else equal. 

Second, variation in the pay structures across job-rating categories and variation in the prob- 
abilities of getting promoted in the different countries can also result in differences in the implicit 
incentives that the managers face. Managers who face larger compensation increases upon being 
promoted and who have higher chances of getting promoted face stronger implicit incentives, 
ceteris paribus. 

In order for the managers’ implicit incentives to be determined by features of their local 
country organizations, their career paths inside this company have to be confined to their respec- 
tive countries. This assumption seems reasonable for this setting for the following reason. The 
company pursues a strategy of being “multi-domestic,” in the sense that the individual country 
organizations and their managers are expected to be very familiar with the local markets, govern- 
ments, and infrastructures in order to be able to respond quickly to changes in the local environ- 
ments. Part of this strategy is to employ local managers. Inspection of the data reveals that, aside 
from the 18 expatriates discussed above, the vast majority of managers are local. The company's 
use of predominantly domestic managers is consistent with other examples in the literature (Brick- 
ley et al. 2009). Thus, the individual country organizations appear to be fairly segregated, making 
it reasonable to assume that the majority of the managers in the sample do not move between 
countries. 

Table 5 reports the correlations among the variables used in the analysis. TB, which captures 
the strength of the explicit incentives, is significantly positively correlated with the job-rating 
category of the highest-ranking manager in the respective country, but is not significantly correlated 
with either of the measures capturing the strength of the implicit incentives (IMPLICITNEXT, 
IMPLICITTOP). With respect to these preliminary findings, one should keep in mind that 
it is likely that incentive compensation practices systematically vary across countries for unobserved 
reasons. As expected, there is a significant correlation between TB and measures that 
indicate that the manager has a high rank in the respective organization (CATEGORY, TOPMANAGER, 
DIVMANAGER). 

The two measures capturing the strength of the implicit incentives, IMPLICITNEXT and 
IMPLICITTOP, are highly correlated with each other; they are also significantly correlated with 
measures that indicate that the manager has a high rank in the respective organization (CATEGORY, 
DIVMANAGER). The latter finding is consistent with the picture that emerges from the 
descriptive statistics in Table 4. Namely, the average values for the strength of the implicit incentives 
are increasing as one moves up the organizational ladder. 

The hypothesis predicts that the explicit incentives provided by the company's compensation 
scheme are decreasing in the strength of the promotion-based implicit incentives that the manager 
faces. In order to investigate the hypothesis, I employ TB, which captures the ratio of the expected 
bonus to base salary, as the dependent variable. As discussed above, I employ several measures to 
capture the strength of the implicit incentives that the manager faces. Specifically, the different 
measures capturing the strength of the implicit incentives are the job-rating category of the 
highest-ranking manager in a respective country (HIGHESTPOSITION) and measures that take 
compensation differentials between adjacent job-rating categories as well as promotion possibili- 
ties into consideration (IMPLICITNEXT, IMPLICITTOP). 

As mentioned above, it is plausible that incentive compensation practices vary systematically 
across countries for unobserved reasons. In order to control for such country-specific differences, 
I subtract from TB a measure that captures the median incentive intensity for mid-level managers 
for a large number of companies in the respective country. Specifically, the measure, which was 
provided by the company's consulting company, captures the median incentive intensity for man- 
agers who hold comparable positions in industrial corporations. The managers hold positions that 
are comparable to jobs that are ranked at level 10 in the company that is studied here. The 
consulting company computed the median incentive intensity using data for 30 to 150 companies 
in a given country. In order to control for systematic differences in the level of growth in the 
different countries, I adjust the variable SALESGROWTH for the respective country's growth in 
GDP from 2007 to 2008. 
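
The two country-level adjustments described in this paragraph amount to simple subtractions. The sketch below assumes that the country's GDP growth is expressed on the same ratio scale as SALESGROWTH (2008 GDP divided by 2007 GDP); this scaling, the function name, and the numbers are assumptions made for illustration only.

```python
def country_adjusted(tb, salesgrowth, country_median_intensity, country_gdp_growth):
    """Return (adjusted TB, adjusted SALESGROWTH) for one manager.

    country_median_intensity: median incentive intensity of comparable mid-level
        managers in the manager's country (benchmark supplied by a consultant).
    country_gdp_growth: GDP growth of the country, assumed to be a 2008/2007 ratio.
    """
    adjusted_tb = tb - country_median_intensity
    adjusted_growth = salesgrowth - country_gdp_growth
    return adjusted_tb, adjusted_growth


# Hypothetical manager: target bonus of 30 percent of salary in a country where
# the benchmark intensity is 20 percent; unit growth of 1.12 against GDP growth of 1.03.
adj_tb, adj_growth = country_adjusted(0.30, 1.12, 0.20, 1.03)
print(round(adj_tb, 2), round(adj_growth, 2))  # 0.1 0.09
```

Under these assumptions a positive adjusted TB means that the manager's explicit incentives exceed the local benchmark, so that cross-country differences in pay practices do not drive the comparison.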

Since the sample includes measures that vary at the country level but that are constant for all 
managers within a country as well as measures that vary across managers, the structure of the data 
set can be described as hierarchical, with managers nested in countries and therefore considered 
lower in the hierarchy than countries (Bryk and Raudenbush 1992). Specifically, the job rating for the 
highest-ranking manager in a country, which is captured by the variable HIGHESTPOSITION, varies only 
at the country level.¹² Expressed in terms of the manager and country level of analysis, the model 
is as follows:¹³ 


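Level 1: 

$$
\begin{aligned}
\mathit{TB}_{ij} ={}& \beta_{0j} + \beta_1(\mathit{IMPLICITNEXT}_{ij} \text{ or } \mathit{IMPLICITTOP}_{ij}) + \beta_2 \mathit{CATEGORY}_{ij} \\
&+ \beta_3 \mathrm{Log}(\mathit{RSALES}_{ij}) + \beta_4 \mathit{SALESGROWTH}_{ij} + \beta_5 \mathit{TOPMANAGER}_{ij} \\
&+ \beta_6 \mathit{DIVMANAGER}_{ij} + \varepsilon_{ij}
\end{aligned}
$$

for manager i in country j, where $\varepsilon_{ij} \sim N(0, \sigma^2)$. 

Level 2: 

$$
\beta_{0j} = \gamma_0 + \gamma_1 \mathit{HIGHESTPOSITION}_{j} + u_{0j}
$$

for country j, where $u_{0j} \sim N(0, \tau)$. 


12 See Anderson et al. (2000) for an application of hierarchical linear models in the accounting literature. 
13 Because the distribution of RSALES is skewed, I use a log-transformation. 


Expressed in combined form, the model is as follows: 

$$
\begin{aligned}
\mathit{TB}_{ij} ={}& \gamma_0 + \gamma_1 \mathit{HIGHESTPOSITION}_{j} + \beta_1(\mathit{IMPLICITNEXT}_{ij} \text{ or } \mathit{IMPLICITTOP}_{ij}) \\
&+ \beta_2 \mathit{CATEGORY}_{ij} + \beta_3 \mathrm{Log}(\mathit{RSALES}_{ij}) + \beta_4 \mathit{SALESGROWTH}_{ij} + \beta_5 \mathit{TOPMANAGER}_{ij} \\
&+ \beta_6 \mathit{DIVMANAGER}_{ij} + u_{0j} + \varepsilon_{ij} \qquad (1)
\end{aligned}
$$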
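
To make the two-level structure concrete, the following sketch simulates data from a simplified version of the model: a country-specific random intercept (Level 2) plus a manager-level equation (Level 1) that keeps only CATEGORY as a regressor. It is an illustration of the data structure and not the estimation procedure used in the study. The coefficient values are loosely patterned on Model I of Table 6, the residual standard deviation `sigma` is hypothetical, and `tau` is treated as the standard deviation of the country intercept, which is an assumption.

```python
import random


def simulate_tb(n_countries=30, managers_per_country=40,
                gamma0=0.243, gamma1=0.025, beta_category=-0.033,
                tau=0.071, sigma=0.10, seed=0):
    """Simulate TB from a random-intercept model with managers nested in countries."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_countries):
        highest = rng.randint(2, 6)                   # HIGHESTPOSITION_j, constant within country
        u0j = rng.gauss(0.0, tau)                     # country-level random effect
        beta0j = gamma0 + gamma1 * highest + u0j      # Level 2 equation
        for _ in range(managers_per_country):
            category = rng.randint(highest, 10)       # nobody ranks above the top manager
            eps = rng.gauss(0.0, sigma)               # manager-level error
            tb = beta0j + beta_category * category + eps  # Level 1 (other controls omitted)
            rows.append({"country": j, "HIGHESTPOSITION": highest,
                         "CATEGORY": category, "TB": tb})
    return rows


def intraclass_correlation(tau, sigma):
    """Share of unexplained variance that lies between countries rather than within them."""
    return tau ** 2 / (tau ** 2 + sigma ** 2)


rows = simulate_tb()
print(len(rows))                                      # 1200
print(round(intraclass_correlation(0.071, 0.10), 3))  # 0.335
```

A sizable intraclass correlation in such data is the reason for modeling the intercept as random at the country level rather than pooling all managers in a single regression.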


Table 6 shows the results of the estimation of Equation (1). I first estimate the model using the 
job-rating category of the highest-ranking manager in the respective country as the measure 
capturing the strength of the implicit incentives (Model I). The hypothesis predicts that the coefficient 
on HIGHESTPOSITION, γ₁, is significantly positive. After controlling for the manager’s 
job-rating category, the managers’ implicit incentives are expected to be lower when the highest-ranking 
manager in the respective country has a lower job-rating category. The results indicate 
that, consistent with the prediction, the strength of the explicit incentives provided by the company's 
bonus plan is higher for managers who have fewer organizational levels left to climb. 

In particular, the coefficient on HIGHESTPOSITION has a value of 0.025 and is significant at 
the 5 percent level, indicating that a manager's expected bonus decreases, on average, by approximately 
2.5 percent of salary for each additional hierarchical level that s/he has left to climb. For example, a manager 
who has three organizational levels left to climb has an expected bonus that is lower by 2.5 percent 
of salary than the expected bonus of a manager holding a similar position but who has only two 
organizational levels left to climb. Given that the mean (median) values of the salary-scaled 
expected bonus are 0.36 (0.30) for managers who are at hierarchical level 6, the difference appears 
to be economically meaningful. 
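
The economic magnitude can be made explicit by scaling the coefficient by the benchmark bonus levels quoted above. The sketch below only restates the reported figures as a ratio; the function name is hypothetical.

```python
def relative_effect(coefficient, benchmark_bonus):
    """Effect of one additional level left to climb, as a fraction of the benchmark bonus."""
    return coefficient / benchmark_bonus


coef_highestposition = 0.025   # Table 6, Model I
mean_tb_level6 = 0.36          # mean salary-scaled expected bonus at level 6
median_tb_level6 = 0.30        # median salary-scaled expected bonus at level 6

print(round(100 * relative_effect(coef_highestposition, mean_tb_level6), 1))    # 6.9
print(round(100 * relative_effect(coef_highestposition, median_tb_level6), 1))  # 8.3
```

In other words, one organizational level corresponds to roughly 7 to 8 percent of the typical expected bonus of a level-6 manager.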

Although the findings with respect to the variable HIGHESTPOSITION are consistent with 
the hypothesis, they may be attributable to alternative explanations. Specifically, it is possible that 
the performance measures used in the bonus plan better reflect the actions of the managers who are 
closer to the top of their organization. Moreover, it is conceivable that managers who have fewer 
organizational levels left to climb have more decision-making authority that is not fully captured 
by the manager's job-rating category. 

Model II is estimated by using IMPLICITNEXT, which is intended to capture the implicit 
incentives that the manager faces when focusing on the next-higher job-rating category. I predict 
a negative coefficient on IMPLICITNEXT. The explicit incentives that are provided by the company's 
bonus plan are expected to be lower when the compensation differential and/or the probability 
of getting promoted, which are captured by IMPLICITNEXT, are higher. The results obtained 
from estimating Model II are consistent with this prediction. Specifically, the coefficient on 
IMPLICITNEXT has a value of −0.079 and is significant at the 5 percent level. For example, if the 
median tenure is shorter by one year for managers at level 9, the value of IMPLICITNEXT 
increases by 0.29, which translates into an expected bonus that is lower by 2.2 percent of salary, 
on average. Similarly, a 50 percent increase in the median expected cash pay at the next level for 
managers in category 9 translates into an expected bonus that is lower by 1.8 percent of salary, on 
average. 

Model III is estimated by using IMPLICITTOP, which is intended to capture the implicit 
incentives that the manager faces when considering the entire organizational ladder. The prediction 
and findings are consistent with the findings for Model II. In particular, the coefficient on 
IMPLICITTOP has a value of −0.047 and is significant at the 5 percent level. For example, if the 
median tenure is shorter by one year for managers at level 9, the value of IMPLICITTOP increases 
by 0.43, which translates into an expected bonus that is lower by 2.0 percent of salary, on average. 
Similarly, a 50 percent increase in the median expected cash pay at the next level for managers in 
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. TABLE 6 
Effect of the Strength of Implicit Incentives on Explicit Incentivesᵃ 

$$
\begin{aligned}
\mathit{TB}_{ij} ={}& \gamma_0 + \gamma_1 \mathit{HIGHESTPOSITION}_{j} + \beta_1(\mathit{IMPLICITNEXT}_{ij} \text{ or } \mathit{IMPLICITTOP}_{ij}) \\
&+ \beta_2 \mathit{CATEGORY}_{ij} + \beta_3 \mathrm{Log}(\mathit{RSALES}_{ij}) + \beta_4 \mathit{SALESGROWTH}_{ij} \\
&+ \beta_5 \mathit{TOPMANAGER}_{ij} + \beta_6 \mathit{DIVMANAGER}_{ij} + u_{0j} + \varepsilon_{ij}
\end{aligned}
$$

                              Model I          Model II         Model III
                              Coefficient      Coefficient      Coefficient
Variableᵇ                     (z-statistic)    (z-statistic)    (z-statistic)
Intercept                     0.243 ***        0.605 ***        0.577 ***
                              (4.40)           (2.93)           (2.75)
HIGHESTPOSITION (+)           0.025 **
                              (2.10)
IMPLICITNEXT (−)                               −0.079 **
                                               (−2.45)
IMPLICITTOP (−)                                                 −0.047 **
                                                                (−2.11)
CATEGORY (−)                  −0.033 ***       −0.049 ***       −0.046 ***
                              (−5.85)          (−4.42)          (−4.30)
Log(RSALES) (+)               −0.001           −0.010           −0.012 *
                              (−1.63)          (−1.42)          (−1.73)
SALESGROWTH (+/−)             −0.019           −0.075           −0.076
                              (−1.40)          (−1.22)          (−1.16)
TOPMANAGER (+)                0.040 **         NA               NA
                              (2.37)
DIVMANAGER (+)                0.026            0.172            0.183
                              (1.52)           (1.25)           (1.26)
S.D. of Intercept (u₀)        0.071 ***        0.148 ***        0.140 ***
Number of observations        877              251              257
Pseudo R²                     39.85%           22.65%           23.73%

*, **, *** Significant at 10 percent, 5 percent, and 1 percent, respectively, all based on two-tailed tests. 
ᵃ Expatriates are excluded from the regression models. Reported are the coefficients from the models with z-statistics in 
parentheses. The models are estimated using standard errors that are clustered by country. 
ᵇ TB is calculated by subtracting the median incentive intensity in the respective country from TB as defined in Table 3; 
SALESGROWTH is calculated by subtracting the respective country's growth in GDP from SALESGROWTH as defined 
in Table 3; Log(RSALES) is natural logarithm of RSALES. The remaining variables are defined in Table 3. 


category 9 translates into an expected bonus that is lower by 1.0 percent of salary, on average. 
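
The translation from a change in IMPLICITTOP to a change in the expected bonus is the product of the Model III coefficient and the change in the measure. The sketch below reproduces the reported 2.0 percent figure from the rounded values given in the text and backs out the change in IMPLICITTOP that is implied by the 1.0 percent figure; the function name is hypothetical, and the implied change is a derived approximation rather than a number reported in the study.

```python
def implied_bonus_change(coefficient, change_in_measure):
    """Change in the salary-scaled expected bonus implied by a change in the incentive measure."""
    return coefficient * change_in_measure


coef_implicittop = -0.047  # Table 6, Model III

# Median tenure shorter by one year at level 9: IMPLICITTOP rises by 0.43.
effect = implied_bonus_change(coef_implicittop, 0.43)
print(round(100 * effect, 1))  # -2.0

# Change in IMPLICITTOP implied by an expected bonus that is lower by 1.0 percent of salary.
implied_change = -0.010 / coef_implicittop
print(round(implied_change, 2))  # 0.21
```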

The findings with respect to the variables IMPLICITNEXT and IMPLICITTOP are consistent 
with the hypothesis. Nevertheless, it is possible that alternative explanations influence the findings. 
For example, it is possible that managers who face only small compensation increases upon 
promotion to the next level have relatively more decision-making authority. 

With respect to the control variables in Table 6, the coefficient on the manager's job-rating 
category has a statistically significant value of around −0.03 to −0.04 across the three specifications. 
The findings indicate that a manager's expected bonus increases by 3 to 4 percent of salary, on 
average, when s/he moves to the next job-rating category. In Model I, the coefficient on the 
indicator variable, which indicates whether the manager is among the highest-ranking managers, is 
significantly positive, indicating that top-ranking managers receive higher explicit incentives. Spe- 
cifically, the coefficient on TOPMANAGER has a value of 0.040, which suggests that top-ranking 
managers have expected bonus payments that are higher by 4 percent of salary, on average. The 
coefficients on the remaining control variables are insignificant at conventional levels. Specifi- 
cally, the coefficient on DIVMANAGER is not significant at conventional levels. A potential ex- 
planation is that the effect is subsumed by the variables TOPMANAGER and CATEGORY. 

In all three models, the standard deviation of the intercept is statistically significant. This 
finding indicates there is significant country-level variation around the intercept. Stated differently, 
there remains significant unexplained variation at the country level. 

It is conceivable that the results in Table 6 are driven by the number of organizational levels 
that the manager has left to climb. In other words, it is possible that the results in Models II and 
III with respect to the variables IMPLICITNEXT and IMPLICITTOP are driven by the manager's 
hierarchical level. In order to isolate the strength of the implicit incentives deriving from compensation 
increases and promotion probabilities, I re-estimate Models II and III from Table 6 after 
including the measure HIGHESTPOSITION, which controls for the number of organizational 
levels that the manager has left to climb. The results in Table 7 indicate that, even after controlling 
for the number of organizational levels that are left for the manager to climb, explicit incentives 
are stronger when promotion-based implicit incentives are weaker. 

In summary, the results in Table 6 and Table 7 indicate that the explicit incentives provided by 
the company's bonus plan are higher when the manager has fewer organizational levels left to 
climb, when s/he faces lower implicit incentives from moving to the next organizational level, and 
when s/he faces lower implicit incentives from moving to the top of the organization. These 
findings support the hypothesis that explicit incentives provided by variable pay schemes are 
stronger when promotion-based implicit incentives are weaker. In a broader sense, the results are 
consistent with the notion that implicit promotion-based incentives are taken into consideration in 
designing explicit incentive contracts. 


Robustness Tests 

The inferences are robust to sensitivity tests. First, the results are robust to estimating an OLS 
regression on the pooled sample where the intercept is treated as non-random. Second, I repeat the 
analyses by including an indicator variable for each of the five different divisions. The results are 
virtually identical. Third, in order to control for the manager's career horizon (Gibbons and 
Murphy 1992), I include the manager's age as an independent variable in the regression analyses, 
which does not change the results. I also repeat the analyses using the ratio of the actual bonus that 
was paid out to base salary as the dependent variable in order to control for potential differences 
in the countries’ bonus payout practices. The inferences remain unchanged. Moreover, the inferences 
are robust to including the unit's sales as a measure of the unit’s size in the regression 
models. The results also remain unchanged when the variable capturing the manager's job-rating 
category, CATEGORY, is replaced by indicator variables in order to address potential nonlinearities. 


V. CONCLUSION 
This study re-examines the hypothesis that the explicit, compensation-based incentives of 
mid-level managers are adjusted to the level of implicit, promotion-based incentives. Specifically, 
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TABLE 7 
Effect of the Strength of Implicit Incentives on Explicit Incentivesᵃ 

$$
\begin{aligned}
\mathit{TB}_{ij} ={}& \gamma_0 + \gamma_1 \mathit{HIGHESTPOSITION}_{j} + \beta_1(\mathit{IMPLICITNEXT}_{ij} \text{ or } \mathit{IMPLICITTOP}_{ij}) \\
&+ \beta_2 \mathit{CATEGORY}_{ij} + \beta_3 \mathrm{Log}(\mathit{RSALES}_{ij}) + \beta_4 \mathit{SALESGROWTH}_{ij} \\
&+ \beta_5 \mathit{TOPMANAGER}_{ij} + \beta_6 \mathit{DIVMANAGER}_{ij} + u_{0j} + \varepsilon_{ij}
\end{aligned}
$$

                              I                II
                              Coefficient      Coefficient
Variableᵇ                     (z-statistic)    (z-statistic)
Intercept                     0.420 ***        0.402 **
                              (4.51)           (3.56)
HIGHESTPOSITION (+)           0.031            0.031
                              (0.95)           (0.98)
IMPLICITNEXT (−)              −0.086 ***
                              (−4.30)
IMPLICITTOP (−)                                −0.057 **
                                               (−2.18)
CATEGORY (−)                  −0.046 ***       −0.045 ***
                              (−5.01)          (−4.62)
Log(RSALES) (+)               −0.003           −0.005
                              (−0.27)          (−0.48)
SALESGROWTH (+/−)             −0.060           −0.065
                              (−0.92)          (−0.93)
TOPMANAGER (+)                NA               NA
DIVMANAGER (+)                0.190            0.204
                              (1.28)           (1.31)
S.D. of Intercept (u₀)        0.130 ***        0.130 ***
Number of observations        257              257
Pseudo R²                     27.04%           28.27%

*, **, *** Significant at 10 percent, 5 percent, and 1 percent, respectively, all based on two-tailed tests. 
ᵃ Expatriates are excluded from the regression models. Reported are the coefficients from the models with z-statistics in 
parentheses. The models are estimated using standard errors that are clustered by country. 
ᵇ TB is calculated by subtracting the median incentive intensity in the respective country from TB as defined in Table 3; 
SALESGROWTH is calculated by subtracting the respective country's growth in GDP from SALESGROWTH as defined 
in Table 3; Log(RSALES) is natural logarithm of RSALES. The remaining variables are defined in Table 3. 


this study revisits the theoretical argument that explicit incentives are optimally stronger in situ- 
ations that pose weaker implicit, promotion-based incentives (Gibbons and Murphy 1992; Gibbs 
1995). 

The analyses in this study are based on compensation data from a large multinational corpo- 
ration. This setting provides an opportunity to observe variation in the strength of implicit incen- 
tives because the sample is comprised of managers with comparable jobs but who face varying 
levels of implicit incentives since they are positioned at different hierarchical levels in their 
respective organization, their promotion possibilities vary, and they experience different levels of 
compensation increases upon promotion. 
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Regression analyses show that the incentives provided by the company’s bonus plan are 
stronger for managers who are positioned at higher hierarchical levels, who face weaker implicit 
incentives from getting promoted to the next level, and who face weaker implicit incentives from 
getting promoted to the top of the organization, after controlling for the position's scope and level 
of accountability. These findings are consistent with the notion that implicit promotion-based 
incentives are taken into consideration in designing explicit incentive contracts, as proposed in the 
theoretical literature. More precisely, the evidence supports the hypothesis that explicit incentives 
are optimally stronger in situations with weaker implicit incentives (Gibbons and Murphy 1992; 
Gibbs 1995). 
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ABSTRACT: This study examines the influence of organizational controls related to 
knowledge management and resource development on assimilation (i.e., strategic integration 
and use) of business intelligence (BI) systems. BI systems use analytics and 
performance management concepts to leverage enterprise system databases and provide 
core management control system (MCS) capability. Our results indicate that organizational 
absorptive capacity (i.e., the ability to gather, absorb, and strategically leverage 
new external information) is critical to establishing appropriate technology 
infrastructure and to assimilating BI systems for organizational benefit. Further, findings 
show that while top management plays a significant role in effective deployment of BI 
systems, their impact is indirect and a function of operational managers' absorptive 
capacity. In particular, this indirect effect suggests that leveraging BI systems is driven 
from the bottom up as opposed to the top down. This differentiates BI from other 
isolated strategic MCS innovations that have traditionally been viewed as top-management 
driven. 
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Data Availability: Data are available through the first author, subject to limitations set 
by the Faculty Human Ethics Advisory Group and the vendor provid- 
ing customer information. 


I. INTRODUCTION 

Enterprise systems (i.e., integrated systems such as enterprise resource planning [ERP]) are 
fundamentally tied to the work of accounting. Enterprise systems have transformative 
implications for the integration of organizational information and control systems governing 
organizations (Chapman 2005; Chapman and Kihn 2009). Unfortunately, research indicates 
that most organizations' management control systems (MCS¹) have not leveraged this potential 
(e.g., Granlund and Malmi 2002; Dechow and Mouritsen 2005; Quattrone and Hopper 2005; Rom 
and Rohde 2007), but have simply used these systems to support existing MCS (Chapman and 
Kihn 2009). Enterprise systems are simply a resource that is made available and must be effectively 
leveraged for MCS enhancements to actually accrue (Chapman and Kihn 2009). 

Business intelligence (BI) systems are widely viewed as the innovation that can leverage the 
wealth of data encapsulated in an enterprise system and support the anticipated transformation to 
a broader and more detailed MCS (Brignall and Ballantine 2004; Williams 2004; Carte et al. 2005; 
Kay 2006; Gnatovich 2007; Robertson et al. 2007). BI systems provide business analytics and 
corporate performance management reporting capabilities, the fundamental information for an 
effective MCS in a technology-driven business environment.² Davila and Foster (2007) identify 46 
specific categories of MCS. The extensive set of pre-built reports and metrics included in most BI 
systems provides support for all of these categories.³ The purpose of this study is to examine the 
deployment of BI systems (an emerging MCS innovation) in order to better understand how an 
organization’s controls related to knowledge management and resource development can lead to 
higher levels of BI assimilation. 

Prior research suggests that integration of the enterprise system and MCS is heavily depen- 
dent on an organization’s people, such that both the social and technical aspects of organizations 
should be considered in future research (Chapman and Kihn 2009). This view is consistent with 
Chenhall (2005), who argues that future research on MCS adoption needs to specifically consider 
how cultural controls influence implementation of an MCS innovation. Simon (1995, 34) defines 
value (i.e., cultural) controls as the “explicit set of organizational definitions that senior managers 
communicate formally and reinforce systematically to provide basic values, purpose, and direc- 


1 We refer to MCS as the formal, information-based routines and procedures that provide managers with measures, 
performance indicators, and procedures to maintain or alter patterns in the organizational activities to ensure that they 
are consistent with the organizational objectives and strategies (Simon 1995; Malmi and Brown 2008). 

2 Business Analytics provide the functionality to extract enterprise data and create a broad range of balanced scorecards, 
key performance indicators, total quality management metrics, and activity-based costing. Primarily they are associated 
with cybernetic controls while also providing some support for compensation/reward controls and administrative controls. 
Corporate Performance Management (CPM) capabilities provide support for budgeting, financial planning, consolidation, 
and enterprise planning. CPM also provides predefined reports for ABC, scorecards, and budgets. Thus, CPM 
primarily extends across planning and cybernetic controls (Howard 2003; Williams and Williams 2007). 

3 The specific BI system used by the organizations that we examine includes three broad applications: financial, supply 
chain, and customer analyses. Financial analysis focuses on accounts receivable (customer credit, performance, cash 
inflow, and organizational effectiveness), general ledger (financial ratio reporting and analysis, financial performance, 
organizational performance, and budget analysis), and accounts payable (cash outflow, performance, vendor account 
analysis, and organizational effectiveness). Supply chain analysis focuses on inventory (stock overview and valuation, 
inventory forecasting, inventory demand, material movement activity, physical inventory, material reservations, and 
organizational effectiveness), procurement (material expenditure, procurement vendor analysis, material demand, process 
effectiveness, and organizational effectiveness) and the accounts payable analysis shared with the financial analysis 
application. Customer analysis includes sales (functional performance, customer sales, product sales, channel sales, 
distribution functional performance, sales organizational effectiveness, and distribution organizational effectiveness) and 
accounts receivable that is shared with financial analysis (Howard 2003). 
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tion." Using that definition, we focus on one specific aspect of cultural control—that of control 
over knowledge management and resource development. The resulting culture determines the 
identification, absorption, and strategic application of new external information (hereafter, the 
level of absorptive capacity). Our focus includes the influence of the top management team (TMT) 
on the culture present at the operational manager level. By studying the level of absorptive 
capacity, we shed light on the interrelationship between cultural controls and deployment of an 
integrated MCS—i.e., MCS as a package (Malmi and Brown 2008). 

The study of MCS as a package addresses another concern in the MCS innovation literature— 
the tendency to study MCS innovations in isolation. Chenhall (2003, 131) notes that MCS inno- 
vation studies are generally done in isolation. If specific accounting controls are systematically 
linked with other organizational controls, then research ignoring these connections may report 
spurious findings. BI systems provide a unique type of MCS innovation, in that they are not 
designed to support a single aspect of control (e.g., activity-based costing or balanced scorecards). 
Rather, BI systems provide powerful extraction capability to supplement substantial expansion of 
both planning and cybernetic controls, while also providing support for administrative and reward/ 
compensation controls. In our study, we focus on organizations' deployment of one BI system
that is specifically designed to enhance MCS capability. At implementation, the system provides 
pre-developed metrics that can be configured to connect to the underlying databases of most 
leading ERP vendors. These metrics provide access to over 200 different pre-built reports using 
more than 500 key performance indicators and analytics answering over 2,900 business critical 
questions. The metrics include a broad array of performance measures for sales analysis, financial 
analysis, inventory and procurement analysis, and supply chain analysis—including a multitude of 
scorecard analyses (Howard 2003). Thus, amid concerns that MCS innovations have been studied 
in isolation from the overall package of MCS systems present in organizations (Chenhall 2005; 
Davila and Foster 2007; Malmi and Brown 2008), BI systems allow us to examine deployment of 
an integrated MCS. 

We also address two general limitations that have been raised with regard to MCS adoption
studies. First is the concern that studies have examined whether firms have adopted MCS, but 
those studies have not explored the variation in quality or depth of the use of MCS following that 
adoption (Davila and Foster 2007). Our research extends beyond adoption to assimilation, which 
includes the scope, use, and strategic integration of a system (Chatterjee et al. 2002). Second is the 
difficulty in moving beyond innovative case-based studies of MCS that can have limited generality 
(Davila and Foster 2005). Garnering access to multiple implementers of an innovation is generally 
quite difficult. Even when access may be possible, the variation in implementations often limits 
comparability. We avoided these problems by approaching one of the major BI vendors and 
securing their full client list. As a result, we acquire responses from a diverse sample of organi- 
zations implementing a common BI system. 


4 Chenhall (2003) differentiates MCS from management accounting systems (MAS) by noting the latter is a subcompo- 
nent of the former. Chenhall (2003) defines the MAS as the systematic use of management accounting practices such as 
budgeting or product costing. MCS is broader and encompasses MAS as well as cultural and administrative controls. 
When we refer to core MCS capability, MAS capability is the primary interest that is encompassed primarily by 
cybernetic and planning controls (Malmi and Brown 2008). Malmi and Brown (2008) define the various categories of 
control as follows: culture controls establish the values, beliefs, and social norms that influence employees’ behavior; 
cybernetic controls provide quantitative measures for activities and processes, set performance standards, provide 
feedback processes, assess variance between goals and accomplishments, and influence individual behavior; planning 
controls set functional area goals, establish standards for assessing accomplishment, and assure alignment of goals 
across the functional areas of the organization; reward/compensation controls motivate and increase the performance of 
individuals; and administrative controls establish organizational structures, lines of accountability, and establish policies 
and procedures.

This study examines the role of top-management's and operational-level-management's absorptive
capacity in facilitating assimilation of BI into the overall MCS. We develop an integrative
model that theorizes relationships between the strategic and operational levels of absorptive
capacity, and the impact on the existing IT infrastructure, in order to improve our understanding of
how organizations assimilate BI. We test the model using data collected from 347 business units
that implemented the BI software. Our results indicate that organizational absorptive capacity is
fundamental to both the readiness of the technology infrastructure for supporting BI integration
and the successful assimilation into the overall MCS.

This study offers several contributions to the MCS literature. First, we examine the influence 
of the overall MCS on the adoption and strategic integration of BI, an MCS innovation. This is 
consistent with calls that advocate that MCS should be considered as a package, allowing MCS 
innovations to be examined in their context rather than in isolation. Second, in examining the
cultural control aspect, we provide clarity to the role of TMT in assimilation of MCS innovation. 
By incorporating absorptive capacity, the result of cultural controls over knowledge management,
we are able to show the critical intermediary role of operational management's absorptive capacity 
on assimilation of an MCS innovation. Third, our study moves beyond adoption of an MCS to 
consider the depth of strategic integration. By examining the assimilation of BI systems, we attain 
a better understanding of the strategic benefit that is derived from an MCS innovation. Fourth, we 
are able to establish that these relationships are consistent across a diverse range of organizations 
by examining one common BI system. 

This study also contributes to the accounting information systems literature. First, for 30 years 
researchers have been focused on the normative development of integrated databases that unify 
enterprise-wide business event data (McCarthy 1982). This study provides empirical evidence of 
the benefit such integrated databases have for effective MCS. Second, our study provides insights 
into the strategic benefit that can accrue from standardized software extracting data from enter- 
prise databases and providing both pre-specified and dynamic reporting capability. Third, we 
expand the general IT literature on the relationship between absorptive capacity and IT assimila- 
tion by focusing not only on the static knowledge component of absorptive capacity, but also the 
knowledge creation component. Our results indicate this latter component is critical to the assimi- 
lation of complex strategic systems such as BI. 

Section II draws from theory on systems assimilation and the dynamic perspective of knowl- 
edge creation to develop a model of BI assimilation. Section III details the research methods, 
operationalization of constructs, and data collection process. Section IV details the results, and 
Section V discusses the implications. 


II. THEORY AND HYPOTHESES

The adoption of enterprise systems and the resulting impact on MCS have been studied with 
great anticipation, given the potential for real benefits from integrated enterprise-wide data. En- 
terprise systems are expected to automate mundane MCS tasks and provide opportunity for 
broader-based MCS. These extended MCS are expected to enhance both management's strategic 
analyses and operational-level analyses as well as improve decision making (Sutton 2000; 
Granlund and Malmi 2002; Chapman 2005; Dechow and Mouritsen 2005; Quattrone and Hopper 
2005; Arnold 2006). While this transformation is prevalent to some limited degree in some 
organizations (e.g., Granlund and Malmi 2002; Caglio 2003; Scapens and Jazayeri 2003; Quat- 
trone and Hopper 2005), the general consensus is that little evolution in MCS has occurred. 
Rather, enterprise-wide data are often simply extracted for use by the same MCS modules that 
existed prior to implementation (Granlund and Malmi 2002; Dechow and Mouritsen 2005; 
Rikhardsson and Kraemmergaard 2006; Rom and Rohde 2007). Simply deploying enterprise 
systems appears insufficient to achieve significant enhancement of organizations’ MCS (Chapman
and Kihn 2009). The professional literature similarly points to the inadequacy of enterprise systems
for facilitating the reporting and analyses necessary to support advanced MCS (e.g., Williams
2004; Kay 2006; Gnatovich 2007; Robertson et al. 2007; Williams 2008).

Enterprise systems provide highly integrative databases, but the ability for the average user to 
extract relevant data without the aid of specialized applications is limited. BI systems are designed 
to facilitate users in conducting detailed analyses of data contained in enterprise databases (Brig- 
nall and Ballantine 2004; Carte et al. 2005; Robertson et al. 2007; Williams 2008). BI systems 
hook into the underlying databases created by enterprise systems and provide a broad array of 
pre-specified reports and business analytics that form the information infrastructure necessary to 
support enhanced MCS capability. 

By definition, assimilation studies are post-adoption studies that assume the decision to adopt 
a technology has been made, and acceptance and diffusion of the system is complete (Chatterjee 
et al. 2002). At an assimilation level, the interest is in whether a strategic technology has been 
integrated and used in a manner that provides strategic benefit, which is the focus of this study. 
Figure 1 provides an overview of our conceptual model. The model focuses on three critical 
organizational factors theorized to affect BI assimilation: TMT’s absorptive capacity, operational 
managers' absorptive capacity, and sophistication of IT infrastructure. An organization's absorptive
capacity is represented by its ability to recognize the value of new, external information,
absorb it, and apply it for commercial ends (Cohen and Levinthal 1990). This focus on the 
absorptive capacity (Cohen and Levinthal 1990; Zahra and George 2002) of an organization's 
top-level and operational management is consistent with the views put forth by Chapman and Kihn 
(2009). This focus entails inclusion of the critical organizational components necessary for suc- 
cessfully leveraging enterprise systems. The joint effects of TMT and operational managers’ 
absorptive capacity capture the effectiveness of cultural controls related to knowledge management.

Cultural Controls and the Development of Absorptive Capacity


Cultural controls relate to setting and instilling a set of values, beliefs, and social norms that 
are shared by an organization's members (Malmi and Brown 2008). Cultural controls encapsulate 
what Simons (1995, 34) identifies as "value" controls, which he defines as "the explicit set of
organizational definitions that senior managers communicate formally and reinforce systematically
to provide basic values, purpose, and direction for the organization." Simons (1995, 34) notes
further that these organizational definitions specify the “values and directions that senior managers 
want subordinates to adopt." Cultural controls play an important role in a broader view of MCS in 
that they set the boundaries and direction for core MCS capability (e.g., planning, cybernetic, and 
reward/compensation controls) and influence the role MCS plays in the organization (Chenhall 
2003; Malmi and Brown 2008). 

TMT represent a small group of the most influential executives with overall responsibility for 
an organization (Hambrick and Mason 1984). While MCS studies often highlight the critical role 
of TMT (Anderson and Young 1999; Bhimani 2003; Davila and Foster 2005; Naranjo-Gill and 
Hartman 2006, 2007; Davila and Foster 2007), TMT's knowledge has received limited attention in 
the MCS literature. TMT's knowledge is considered a primary indicator of TMT's competence 
(Armstrong and Sambamurthy 1999; Bassellier et al. 2003), yet research has failed to link TMT 
knowledge and assimilation (e.g., Armstrong and Sambamurthy 1999). We explore this phenomenon
with the belief that prior studies, which have focused on a static view of TMT’s knowledge,
fail to capture the knowledge-creation activities that should drive technology innovation. 

TMT knowledge from a dynamic view of organizational knowledge should provide a much 
better encapsulation of TMT’s knowledge capabilities. In turn, a dynamic view should better 
capture the underlying theoretical reasons for the link between TMT and BI assimilation. An
absorptive-capacity view focuses on the synergies of both TMT’s knowledge and TMT's ability to 
put that knowledge into practice. From a theory perspective, TMT’s absorptive capacity should be 
the key determinant of TMT's ability to provide effective leadership and support increased absorptive
capacity at all levels of the organization. TMT's absorptive capacity is a key element for
cultural controls to be effective in promoting organizational absorptive capacity. 

Within an MCS context, operational managers can play a key role in driving the use of BI 
systems to support an organization's MCS needs. The effect of TMT’s absorptive capacity on 
assimilation is therefore best conceptualized as indirect, mediated by operational management's 
absorptive capacity. Case-based evidence of the phenomenon can be interpreted from Caglio 
(2003), where the CFO became the driving force behind the design and implementation of the
enterprise system. In the course of taking this lead, the CFO empowered the management accountants
to be retrained to take on more of an analytic role for the TMT and a consulting role to a
broad range of functional areas. Hence, the CFO's quick adaptation to the enterprise system and
his rapid acquisition of relevant knowledge and capabilities allowed him to be an effective force 
in the implementation. However, the actual individuals implementing the changes that facilitated 


The static view of organizational knowledge perceives organizations as a stock of knowledge that an organization 
possesses (Nonaka 1994; Nonaka and Takeuchi 1995; Grant 1996; Cook and Brown 1999). Most assimilation studies
that have drawn on the knowledge-based view follow the static view and capture the amount of knowledge that people 
possess. Cook and Brown (1999), among many scholars, criticize research on organizational knowledge which follows 
the static view (Nonaka 1994; Kim 1998; Nonaka and Toyama 2003). Cook and Brown (1999) refer to this traditional 
understanding of organizational knowledge as the epistemology of possession because organizational knowledge is 
treated as something people possess. 

The dynamic perspective of organizational knowledge is consistent with a large body of current knowledge management 
literature (Ditillo 2004; Vera-Munoz et al. 2006) that suggests understanding the organization's knowledge through its 
dynamic capabilities of knowledge creation rather than the stock of knowledge (static view) that organizations possess 
(Nonaka 1994; Grant 1996; Cook and Brown 1999).

the transformation are the operational-level accountants acquiring the new knowledge to facilitate
the transformation. The CFO was able to communicate his vision, providing a strong cultural 
control that helped transform behavior. This organization's experience is consistent with theoriza- 
tions that organizations implementing controls promoting organizational absorptive capacity are 
better able to overcome knowledge barriers and effectively utilize new systems (Attewell 1992; 
Fichman and Kemerer 1997; Ravichandran 2005). 

Cohen and Levinthal (1990), incorporating Attewell's (1992) views on organizational learning 
and innovation, posit that an organization's absorptive capacity facilitates learning and, in turn, 
drives innovation. As noted earlier, an organization's absorptive capacity is represented by its 
ability to recognize the value of new, external information, absorb it, and apply it for commercial 
ends (Cohen and Levinthal 1990). Cohen and Levinthal (1990) suggest that prior relevant knowledge
and intensity of effort are critical elements for developing effective absorptive capacity.
Intensity of effort represents the energy that members of an organization commit to solving
problems and creating new knowledge.


Absorptive Capacity and BI Assimilation 


Successfully assimilating a BI system that automates MCS capabilities aligned with business 
strategies is dependent upon the development of relevant knowledge and skills (Fichman and 
Kemerer 1997; Armstrong and Sambamurthy 1999). The development of relevant knowledge and 
skills within an organization is highly dependent on the cultural controls management has put in 
place and supported (Malmi and Brown 2008). These cultural controls are used by organizations 
to integrate individuals' absorptive capabilities into the organization's routine and practice 
(Nonaka 1994; Kim 1998; Van den Bosch et al. 1999). Organizational absorptive capacity reflects 
the competence attributed to an organization's members (Szulanski 1996; Zahra and George 2002; 
Bassellier et al. 2003). 

Two levels of organizational absorptive capacity are critical to BI assimilation. TMT's absorp- 
tive capacity represents the collective ability of TMT members to recognize the value of new 
information gathered from both internal and external sources, absorb it, and apply it to support 
their leadership role in strategic planning and control. TMT's absorptive capacity is determined by 
the broader knowledge and expertise that its members possess as well as knowledge-creation 
activities, including interaction with external constituents, such as competitors, customers, and 
peers (Nambisan et al. 1999; Daghfous 2004). Operational-level managers' absorptive capacity 
refers to the ability of managers at the operational level to value new information, absorb it, and 
apply it to support the organization's business strategy and value chain activities. Absorptive 
capacity at the operational level is heavily influenced by cultural controls, including TMT inter- 


This definition of absorptive capacity suggests that the organization's absorptive capacity is built upon three capabilities: 
value, absorption, and application (Cohen and Levinthal 1990; Zahra and George 2002). (1) Value capability refers to 
the organization's ability to recognize new information, whether received from internal or external sources. This 
capability requires the organization to possess prior relevant knowledge and expertise that will help to value/assess new 
information. (2) Absorption capability refers to the organization's ability to analyze, process, interpret, and understand 
the new information. (3) Application capability refers to the ability of the organization to use the new information to 
support an organization's activities and strategies. The three capabilities are combinative in nature as they build upon 
each other to create the dynamic capability of the organization. This also suggests that an organization's absorptive 
capacity is path-dependent, as prior relevant knowledge must exist in order to facilitate absorption and use of new 
information. 

Kim (1998) suggests a matrix that describes the interaction between prior relevant knowledge and intensity of effort and 
the effect of that interaction on the organizational absorptive capacity. According to Kim (1998) the absorptive capacity 
of the organization will be at its highest level when both the intensity of effort (i.e., action) and prior relevant knowledge
are high. The absorptive capacity will be at the lowest level when both prior relevant knowledge and knowledge-related 
effort are low. In this study we use the underlying concept articulated by this matrix to operationalize the absorptive 
capacity construct tested in our research model.

vention and focused knowledge-creation activities (Boynton et al. 1994; Fichman and Kemerer
1997; Jansen et al. 2005). 

BI assimilation is largely focused on the support and integration of core MCS control pro- 
cesses (e.g., planning and cybernetic) that are of primary concern at the operational or line levels. 
These core control processes are generally thought of as the management accounting system 
(Chenhall 2003; Malmi and Brown 2008). We see parallels in the few reported successful enter- 
prise system integration studies. For instance, Caglio (2003) notes the empowerment of manage- 
ment accountants across operational and line functions with their ability to leverage enterprise- 
wide data for a broad range of functional areas. We posit that operational-level absorptive capacity 
is a critical precursor to BI assimilation, leading to H1: 


H1: Operational-level managers' absorptive capacity will positively enhance organizations'
BI assimilation.


As noted earlier, TMT plays a key role in establishing the cultural controls that motivate and 
enable the development of absorptive capacity. TMT leadership roles can be viewed from two 
perspectives: external and internal (Ulrich and Wiersema 1989; Kakabadse et al. 1995). External 
leadership roles include the ability to interact with the changing environment and interpret this into 
internal vision (Daft and Weick 1984; Hambrick 1995). Internal leadership roles include the 
design and management of employees' actions that enable realization of an organization's vision
(Kakabadse et al. 1995; Anderson and Young 1999; Caglio 2003; Chenhall and Euske 2007). This 
includes provision of knowledge-creation mechanisms that address knowledge gaps at both the 
TMT and operational levels and are necessary to support new strategies and enabling technologies 
(Keen 1991; Nonaka et al. 1998; Caglio 2003). 

Identifying and remediating knowledge gaps is critical, as BI assimilation is dependent on 
operational-level managers understanding the full potential of BI systems. This requires 
operational-level managers to raise their IT literacy to a level conducive to effective deployment
(Rikhardsson and Kraemmergaard 2006). This dependency on operational-level learning suggests
the relation between TMT's absorptive capacity and BI assimilation is mediated by operational- 
level absorptive capacity. The second hypothesis is stated as: 


H2: TMT's absorptive capacity will positively enhance operational-level managers' absorptive
capacity.


H2a: TMT's absorptive capacity will positively enhance organizations' BI assimilation, 
through the operational-level managers' absorptive capacity. 


IT Infrastructure Sophistication and BI Assimilation

IT infrastructure sophistication refers to "the extent to which an organization has diffused the 
key information technologies into its foundation for supporting business applications" (Armstrong 
and Sambamurthy 1999, 309). IT infrastructure sophistication reflects the diversity and integration 
of IT components necessary to support BI systems. BI is designed to leverage complex business 
data (Quattrone and Hopper 2005; Dechow and Mouritsen 2005) that are integrated with other 
business information in the creation of enterprise-wide databases (Granlund and Malmi 2002). 
Accordingly, we posit that organizations with the sophisticated IT infrastructure to support information
systems integration and enterprise-wide data integration will be better able to successfully
assimilate BI systems. This reasoning leads to H3: 


H3: The sophistication of IT infrastructure will positively enhance an organization's BI
assimilation.

Absorptive Capacity and IT Infrastructure Sophistication


IT users at the operational level are important sources of IT innovation (von Hippel 1994; 
Nambisan et al. 1999). In an enterprise systems environment, operational-level managers are 
capable of driving systems configuration (Caglio 2003; Rikhardsson and Kraemmergaard 2006; 
Byrne and Pierce 2007; Elbashir and Williams 2007). The manner in which IT infrastructure is 
deployed can directly affect the information available to support MCS objectives (Quattrone and 
Hopper 2005). Supporting broad-based MCS capabilities necessitates sophisticated infrastructure 
configurations only possible through high levels of operational-level absorptive capacity (Dechow 
and Mouritsen 2005). This reasoning leads to H4:


H4: Operational-level managers’ absorptive capacity will positively enhance the IT infra- 
structure sophistication of the organization. 


TMT contributes to various IT infrastructure-related activities including project planning, 
resource allocation, and user problem solving. Building sophisticated IT infrastructure to support 
various applications requires a sound understanding of the organization's strategic objectives and 
the types of IT infrastructure services required to support those objectives (Broadbent et al. 1999). 
TMT with higher absorptive capacity will be better able to align organization-wide IT infrastruc- 
ture investments with business strategies. This reasoning leads to the following hypothesis: 


H5: TMT's absorptive capacity will positively enhance the IT infrastructure sophistication of 
the organization. 


Control Variables 


Prior studies suggest that two ancillary factors can influence assimilation: time since adoption 
(Anderson and Young 1999) and firm size (Davila and Foster 2005, 2007). Two proxies are used 
for firm size: Number of employees and gross revenue of the firm (Zhu and Kemerer 2002; 
Subramani 2004; Liang et al. 2007). These factors are modeled as control variables to isolate 
primary influences on BI assimilation. 
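
As an illustrative aid only (it is not part of the study's analysis), the hypothesized relationships H1 through H5 can be written down as a small directed graph. The construct labels `TMT_AC`, `OPS_AC`, `IT_INFRA`, and `BI_ASSIM` and the helper `find_paths` are shorthand invented for this sketch. Enumerating the routes from TMT's absorptive capacity to BI assimilation shows that every hypothesized route is indirect, with H2a corresponding to the route through operational-level managers' absorptive capacity.

```python
# Hypothesized paths taken from the text, keyed by hypothesis label.
# Construct names are shorthand for this sketch, not the authors' variable names.
HYPOTHESIZED_PATHS = {
    "H1": ("OPS_AC", "BI_ASSIM"),    # operational-level absorptive capacity -> BI assimilation
    "H2": ("TMT_AC", "OPS_AC"),      # TMT absorptive capacity -> operational-level absorptive capacity
    "H3": ("IT_INFRA", "BI_ASSIM"),  # IT infrastructure sophistication -> BI assimilation
    "H4": ("OPS_AC", "IT_INFRA"),    # operational-level absorptive capacity -> IT infrastructure
    "H5": ("TMT_AC", "IT_INFRA"),    # TMT absorptive capacity -> IT infrastructure
}

# Control variables named in the text (time since adoption and two size proxies).
CONTROLS = ("time_since_adoption", "number_of_employees", "gross_revenue")


def find_paths(source, target, edges):
    """Return every directed path from source to target as a list of nodes.

    Assumes the edge list is acyclic, which holds for the paths above.
    """
    if source == target:
        return [[source]]
    paths = []
    for start, end in edges:
        if start == source:
            for tail in find_paths(end, target, edges):
                paths.append([source] + tail)
    return paths


tmt_routes = find_paths("TMT_AC", "BI_ASSIM", list(HYPOTHESIZED_PATHS.values()))
for route in tmt_routes:
    print(" -> ".join(route))
```

No route has length two, which reflects the text's position that the effect of TMT's absorptive capacity on assimilation is best conceptualized as indirect.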


III. RESEARCH METHOD

We use a field survey method to test the research hypotheses. The survey was distributed to 
Australian client organizations of a single international vendor providing BI software. The vendor 
provides a major BI system used internationally and that is specifically recognized for its packaged 
MCS capability. The system is configurable for leading ERP packages, enabling the BI system to 
directly access the underlying enterprise databases. There are over 200 different pre-defined stan- 
dard reports using over 500 key performance indicators (KPIs) and containing analytics to answer 
over 2900 business-critical questions. The metrics and reports support MCS capabilities for sales 
analysis, financial analysis, inventory and procurement analysis, and supply chain analysis, includ- 
ing a broad range of scorecards for assessing performance (Howard 2003). Focusing on a single 
software vendor controls for variation that may occur from differences in MCS capability across 
BI software options. 

We distributed surveys to 1,873 managers in 612 organizations that the BI vendor provided 
(subject to a written nondisclosure agreement) as a contact list of clients. Where possible, we 
selected multiple respondents for each organization from the vendor's contact list to include senior 
executives, operational managers, and IT users. For a “small organization" that only had a single 
contact person, the organization was selected if the contact person was a senior executive (e.g., 
chief executive officer [CEO], chief financial officer [CFO], or chief information officer [CIO]). A 
multiple-respondent strategy is preferred for the richness of the data, to mitigate bias, and to 
enhance accuracy (Sethi and King 1994; Huber and Power 1985).

The survey protocol followed the guidance of Dillman (2000). We mailed survey packets
including a cover letter, survey, and pre-paid reply envelope to each selected contact. A first 
reminder was sent by email four weeks later to all the recipients. A second survey packet was sent 
four weeks after the email reminder to all nonrespondents. A final reminder was emailed two
weeks later with a URL link to the web-based version of the survey. Online surveys are used in 
prior studies for both sole and supplementary survey methods (Dillman 2000). Our tests reveal no 
differences between paper and online responses. 

An average of three respondents in each targeted organization received the survey. We asked 
respondents from organizations with multiple strategic business units (SBUs) to choose whether to 
answer the survey on behalf of either a SBU or the whole enterprise. We received a total of 436 
responses from 229 organizations including 65 online responses. Due to missing data and selection
of “No Basis for Answering” responses on the main study variables, 17 responses are deemed
unusable. This resulted in 419 usable responses, for a usable response rate of 22 percent and 36
percent for individual and organization response, respectively. The responses included 135 choosing
to identify and respond on behalf of their SBUs. Including responses for SBUs and responses
for organizations as a whole, the final sample consists of 347 organizational units. 
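
The response-rate arithmetic above can be checked with a few lines of code. This is a sketch using only the counts reported in the text. The number of organizations that remained after the 17 unusable responses were removed is not reported here, so the organization-level figure below is an upper bound computed under a stated assumption, not a reproduction of the 36 percent figure.

```python
# Counts reported in the text.
surveys_distributed = 1873
organizations_contacted = 612
responses_received = 436
responding_organizations = 229
unusable_responses = 17

usable_responses = responses_received - unusable_responses
individual_rate = usable_responses / surveys_distributed

# Upper bound only: assumes no organization dropped out entirely when the
# unusable responses were removed (an assumption; the text does not say).
organization_rate_upper_bound = responding_organizations / organizations_contacted

print(usable_responses)
print(f"{individual_rate:.1%}")
print(f"{organization_rate_upper_bound:.1%}")
```

The individual-level rate rounds to the 22 percent reported, and the upper bound of roughly 37 percent is consistent with the reported 36 percent if a small number of organizations were lost with the unusable responses.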

The average age of respondents is 41.1 years, with 16.5 years of work experience. Respon- 
dents are 80 percent male and 20 percent female. Respondents classifying themselves as business 
executives/managers comprise 54 percent, with 46 percent IT executives/managers and 13 percent 
holding both business and IT jobs. Over half (54 percent) of respondents report five to eight years 
of BI systems experience, while 26 percent have over eight years of experience. Large organiza- 
tions are the predominant respondents, with an average of 663 employees and gross revenue 
exceeding AUD$2 billion. The sample is also diverse in industry representation, with the most 
common being Manufacturing (65, 18.7 percent) and Retail/Wholesale/Distribution (50, 14.4 percent).
Double-digit responses were also received in Banking/Finance/Insurance, Transport/Logistics,
Media/Entertainment/Publishing, Healthcare, Telecom, Agricultural/Mining/Construction,
and Consulting/Professional Services.

To test the consistency of the responses, we compute the correlation of multiple responses 
from the same organization on the main constructs (see Armstrong and Sambamurthy [1999] for 
details). All correlations between two or more respondents from the same organization on the main 
constructs are significant (p < 0.01). The results provide strong evidence of the consistency 
between responses from a single organization. As a consequence, we use the average scores from 
multiple respondents as the organizational response. For organizations with a single respondent, 
the individual response is used to represent the organization. 
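
The aggregation rule described above can be sketched in a few lines. This is a minimal illustration only, assuming responses are available as (organization, score) pairs; the organization labels, the scores, and the function name are hypothetical and are not taken from the survey data.

```python
from collections import defaultdict
from statistics import mean

def organizational_scores(responses):
    """Average construct scores across respondents from the same organization.

    `responses` is an iterable of (organization_id, score) pairs. An
    organization with a single respondent keeps that respondent's score.
    """
    by_org = defaultdict(list)
    for org_id, score in responses:
        by_org[org_id].append(score)
    return {org_id: mean(scores) for org_id, scores in by_org.items()}

# Hypothetical example: organization "A" has two respondents, "B" has one.
sample = [("A", 4.0), ("A", 5.0), ("B", 3.0)]
org_scores = organizational_scores(sample)
```

Averaging is defensible here only because the responses from a single organization are reported to be consistent; with divergent respondents, a mean would mask disagreement.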

We perform three tests for common method bias: Harman’s one-factor, partialling out a 
“marker variable,” and partialling out a general factor score (Podsakoff and Organ 1986; Podsakoff 
et al. 2003). The results indicate no significant common method variance that threatens the 
quality of the data. 


9 In addition, 91 respondents reported that they were not the correct informants to answer the survey. Another 70 sent their 
apology by email and often quoted reasons for not responding to the survey (such as company policy). 

Early and late responses were compared in paired samples of 150, 100, 50, and 40 using an ANOVA test for nonresponse 
bias. The results show no significant differences on any of the study variables, including demographic and control 
variables. There is no indication of any nonresponse bias. 

An ANOVA test shows no significant differences between the individual and averaged responses. 

Harman’s one-factor test showed that no single factor emerged from the exploratory factor analysis and that no one general 
factor accounted for the majority of the variance in the measurement items used in the model. In partialling out an 
unrelated “marker variable” as a surrogate for common variance, we used theoretically unrelated variables (respondents’ 
age and years of experience) and examined the structural model both with and without the “marker variables.” The 
results suggest that neither of the two marker variables is statistically significant. In performing the test that partials out a 
general factor score, we added the highest factor from the “unrotated” exploratory factor analysis test to the PLS 
model as a control variable on the dependent and mediating variables in the model. It is assumed that this factor contains 
Operationalization of Constructs 


BI Assimilation 


BI Assimilation is measured by adapting Armstrong and Sambamurthy's (1999) instrument 
for IT assimilation. Building on Porter's (1985) value chain framework, Armstrong and Sambam- 
urthy (1999) use 14 items to measure IT assimilation, six that capture IT assimilation in different 
business activities and eight items that measure IT assimilation in business generic strategies. 
Armstrong and Sambamurthy (1999) report three dimensions for IT assimilation: logistics activi- 
ties, marketing activities, and business strategies. A comparison of factors in Armstrong and 
Sambamurthy's (1999) measure with those identified by the literature as representative of advanced 
MCS suggests that most attributes involving enterprise data are captured. One item, “managerial 
processes,” is added based on BI literature and feedback in a focus group meeting. This 
process results in a total of 15 items to measure BI assimilation. 

Exploratory factor analysis supports the use of only 14 items. We eliminate one item from the 
measurement list due to high cross-loadings (more than or close to the threshold of 0.50).13 The 
result from the exploratory factor analysis shows the remaining 14 items load on three factors of 
BI assimilation. These factors are referred to as (1) customer relations, (2) business operations, 
and (3) marketing and sales. Each factor combines both generic strategies and business activities 
related to a specific business function. 


Organizational Absorptive Capacity 


The measures of absorptive capacity at the TMT and operational levels in this study reflect the 
two core elements of absorptive capacity: prior relevant knowledge and intensity of effort 
(Cohen and Levinthal 1990; Kim 1998). We use the four modes of knowledge creation suggested 
by Nonaka (1994), namely socialization, externalization, combination, and internalization,14 as the basis to 
measure intensity of effort for both operational and TMT levels of the organization. The four 
knowledge-creation modes represent the organization’s effort and ability to create and absorb new 
information from internal and external sources, convert it into new usable knowledge, and apply it 
to support strategic planning and execution of business strategies (Nonaka 1994; Davenport and 
Prusak 1998). 


Operational-Level Absorptive Capacity 

The measure of intensity of effort at the operational level consists of 25 items divided between 
socialization (seven items), externalization (six items), combination (five items), and internaliza- 
tion (seven items). Confirmatory factor analysis supports the use of only 22 items to measure the 


the best approximation of the common method variance (Podsakoff and Organ 1986; Podsakoff et al. 2003). The 
findings show the original results are not affected by the general factor included in the model. 

The new item “managerial processes” was removed from the measure of BI assimilation as it cross-loaded on to two 
factors. One plausible explanation for the cross-loading is that the item refers to broad managerial activities that span the 
whole value chain. Note also that our three dimensions of BI assimilation do not exactly align with those of Armstrong 
and Sambamurthy (1999). Our focus is on the whole value chain, while Armstrong and Sambamurthy (1999) tested the 
assimilation of “IT in general.” However, all of their items do load on the overall construct and the variance in 
sub-categorization is likely due to the diverse nature of the industries and the different type of IT used in our study. 
Nonaka (1994) and Nonaka and Takeuchi (1995) suggest that organizational knowledge creation is captured by four 
modes of knowledge conversion: socialization (the process of creating tacit knowledge through shared experience), 
externalization (the process of converting tacit knowledge into explicit knowledge), combination (the process of creat- 
ing explicit knowledge from explicit knowledge), and internalization (the process of converting explicit knowledge into 
tacit knowledge). Each of the four modes creates new knowledge of a specific type, and the dynamic between the four 
modes enables organizational knowledge creation. 

The derived measure of intensity of effort is based on three prior studies (Nonaka et al. 1994; Becerra-Fernandez and 
Sabherwal 2001; Choi and Lee 2002) that used Nonaka’s (1994) four modes of knowledge creation. However, the three 
studies were conducted in three different countries (Japan, U.S., and South Korea) and the measurement items used in 
four modes of knowledge creation. We eliminate three items from the operational-level, 
knowledge-creation measure because they load below 0.50 on the relevant dimension (Hair et al. 
1998). We then use measurement indices produced by Smart PLS to examine the properties of 
measures for the knowledge-creation modes.16 All four measures show satisfactory levels of internal 
consistency and convergent and discriminant validity.17 We derive four composite variables, 
one for each of the four modes, to proxy for intensity of effort by averaging 
the respondents’ scores. 

The prior relevant knowledge component of absorptive capacity captures the shared knowl- 
edge between line and IT managers. Nelson and Cooprider (1996) define shared knowledge as the 
understanding and appreciation among IT and line managers for the technologies and processes 
that affect their mutual performance. We use the five items suggested by Nelson and Cooprider 
(1996) to measure shared knowledge between line and IT managers, which represent two types of 
measures, multiplicative and general. 


(1) For multiplicative or interaction measures, respondents are asked to assess separately the 
role of IS and line managers for two characteristics: Items 1 and 2 capture understanding, 
while items 3 and 4 capture appreciation. Using the conceptualization of fit interaction (Venkatraman 
1989; Nelson and Cooprider 1996), we operationalize the two concepts of understanding 
and appreciation by multiplying the two relevant items for each of the concepts (i.e., 
item1 × item2 and item3 × item4). 

(2) For general measures, we ask respondents to rate the overall level of appreciation line 
managers and IS managers have for each other's accomplishments (item 5). 


This procedure results in three items to capture shared knowledge (one general item and two from 
the outcome of the multiplication). The multiplicative measures provide stronger evidence of the 
validity of the measurement instrument than would be possible from only one type of measure 
(e.g., general; Nelson and Cooprider 1996). This is because the distribution of the final measure 
score depends on the extent to which two indicators agree with each other. 
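
The construction of the three shared-knowledge indicators can be illustrated as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the example ratings and the 1-to-7 scale are assumptions, since the rating scale is not given in this section.

```python
def shared_knowledge_indicators(item1, item2, item3, item4, item5):
    """Return the three shared-knowledge indicators for one respondent.

    item1, item2: the two understanding ratings
    item3, item4: the two appreciation ratings
    item5: the general rating of mutual appreciation
    """
    understanding = item1 * item2   # multiplicative measure (item1 x item2)
    appreciation = item3 * item4    # multiplicative measure (item3 x item4)
    return understanding, appreciation, item5

# Hypothetical ratings on an assumed 1-7 scale.
agree = shared_knowledge_indicators(6, 6, 5, 5, 6)      # both ratings high
disagree = shared_knowledge_indicators(7, 2, 5, 5, 6)   # ratings diverge
```

A product is large only when both ratings are high, so the understanding indicator falls from 36 to 14 in the second case even though one of the two ratings is higher. This is the sense in which the final score depends on agreement between the two indicators.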

We derive the operational-level, absorptive-capacity score by creating the interaction term 
between the intensity of effort and shared knowledge variables. We use the two-step score con- 
struction procedure suggested by Chin et al. (2003) to estimate the composite variable of the 
interaction term because the measurement of one of the interaction variables (intensity of effort) is 
modeled formatively (Chin et al. 2003; Lu and Wang 2008). 
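
The second step of this procedure can be sketched as below. The sketch assumes that the step-one composite scores are standardized before they are multiplied, a common convention that the text does not spell out; the scores and function names are hypothetical, and in the study the step-one scores come from PLS rather than from a hand-built list.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores (mean 0, population standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def interaction_scores(predictor_scores, moderator_scores):
    """Step two: multiply the standardized composite scores case by case."""
    zp = standardize(predictor_scores)
    zm = standardize(moderator_scores)
    return [p * m for p, m in zip(zp, zm)]

# Hypothetical step-one composite scores for four organizational units.
intensity_of_effort = [2.0, 4.0, 6.0, 8.0]
shared_knowledge = [1.0, 3.0, 5.0, 7.0]
absorptive_capacity = interaction_scores(intensity_of_effort, shared_knowledge)
```

Working from composite scores rather than from products of individual indicators avoids multiplying formative indicators that need not reflect the same underlying construct.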


these three studies are not the same. One concern was that some items used could be driven by country-specific culture 
that may not be relevant to Australian organizations. To overcome this concern, we synthesized the knowledge-creation 
modes measures reported in the three papers. To improve measure validity, items in common to at least two of the three 
instruments were chosen for the initial measurement draft. The initial measurement draft was subjected to a number of 
pilot-tests including interviews with senior managers, feedback from academics, two focus group meetings, and a small 
survey. As a result of the pilot testing, seven additional items were added to the initial measurement list. 

We use Smart PLS to examine the measurement properties of the two measures (i.e., at both operational and TMT 
levels) of Nonaka's (1994) knowledge-creation modes using the two-stage method. The eight knowledge-creation modes 
(4 modes to proxy for TMT's intensity of effort and 4 to proxy for operational managers' intensity of effort) were 
concurrently examined in one model. The results indicate that each of the measurement modes shows high convergent 
and discriminant validity. 

The results indicate the following for all measures: composite reliability measures are above 0.70, the average variance 
extracted is above 0.50, items load higher on their associated factors than on any other factors, and the square root of 
AVE of all constructs is higher than its correlation with any other construct in the model. 

As Chin et al. (2003, 11) explain, the formative indicators are not assumed to reflect the same underlying construct (i.e., 
can be independent of one another and measuring different factors). Therefore, the product indicators (in our model the 
product of multiplying the four scores for intensity of effort by the three scores for shared knowledge) will not 
necessarily tap into the same underlying interaction effect. Following the two-step procedure, the first step entails using 
TMT's Absorptive Capacity 

For the intensity of effort element of absorptive capacity, we develop draft measures for 
Nonaka's (1994) four knowledge-creation modes using prior literature on TMT (Hambrick 1981, 
1995; Hurst et al. 1989; Kakabadse et al. 1995; Kippenberger 1997; Wu et al. 2002). We include 
31 items in the initial measurement draft. We discussed the initial measurement items with two 
senior managers and a knowledge management consultant. We ask the three managers in three 
separate meetings (1) whether the items included in the list represent the knowledge-creation 
activities that usually take place at the TMT level, and (2) whether there are any additional 
knowledge-creation activities relating to the TMT that should be included. This process was 
followed by several pilot tests including feedback from academics, two focus group meetings, and 
a small survey. Feedback from participants during the pilot tests was analyzed and incorporated in 
the measurement draft when appropriate. We include 26 items in the final survey, representing the 
four modes: socialization (seven items), externalization (six items), combination (six items), and 
internalization (seven items). 

Exploratory factor analysis supports only 19 of the items for measuring knowledge creation at 
the TMT level. We eliminate seven items because they load below 0.50 on the relevant factor 
(Hair et al. 1998). All measures of the four knowledge-creation modes show satisfactory levels of 
internal consistency and convergent and discriminant validity. 

We create four composite variables representing the four knowledge-creation modes by averaging 
the respondents' scores for each mode.19 The prior relevant knowledge component of absorptive 
capacity is measured using three items suggested by Armstrong and Sambamurthy (1999) 
to capture TMT's strategic knowledge of IT. The results reported in Table 1 indicate that the 
measure of TMT's strategic knowledge has high internal consistency and convergent and discrimi- 
nant validity. We again follow Chin et al.'s (2003) two-step approach to create the TMT absorptive 
capacity composite scores using TMT’s strategic knowledge and intensity of effort composite 
variables. 


IT Infrastructure Sophistication 


We measure IT infrastructure sophistication using a scale adapted from Armstrong and Sam- 
bamurthy (1999), refining the scale by drawing on literature from BI, interviews with BI profes- 
sionals, and meetings with two focus groups. We assess the level of IT infrastructure sophistica- 
tion by asking respondents to indicate the extent to which their organization had diffused the ten 
IT infrastructure components (see Table 1). The final measurement instrument consists of ten key 
IT infrastructure components that capture two dimensions of enterprise systems related IT infra- 
structure (i.e., generic and specialized IT infrastructure) and combine to provide the basis for a 
shared data environment. 


the formative indicators in conjunction with PLS to create the underlying construct scores for predictor and moderator 
variables. In step two we use the two composite variables to create a single interaction term representing the absorptive 
capacity variable. 

19 We conducted sensitivity tests by creating composite variables for each knowledge-creation mode (at both operational 
and TMT level) using weighted averages whereby each item load is multiplied by the corresponding measure score. We 
added the outcomes and calculated the average as the composite variables (modes) score. Results remain the same. 
Measurement properties were discussed in footnote 14. 
IV. RESULTS 
We use partial least squares (PLS), a component-based structural equation modeling technique, 
to validate our constructs’ measures and test the research model and hypotheses.20 The 
overall results are shown on the structural model presented in Figure 2. 


Measurement Properties 


We apply multiple tests suggested by Churchill (1979) and Straub (1989) to assess construct 
validity and reliability. First, we use either exploratory or confirmatory factor analyses, depending 
on the maturity of the measure, to examine the dimensions and loading of measurement items. 
Second, we simultaneously test the structural model and the measurement model using PLS. The 
output from PLS in relation to the measurement model verifies the initial results from the explor- 
atory factor analysis tests. Item loadings measure the significance of the item to the factor. For 
constructs measured with reflective indicators, we drop items with loadings below 0.70 from the 
construct measure as they indicate that less than 50 percent of the variance of the variable is 


FIGURE 2 
Structural Model Results for BI Assimilation 
20 We use Smart PLS 2.0. PLS is a component-based structural equation modeling (SEM) technique that simultaneously 
tests the psychometric properties of the scales used to measure the constructs (i.e., measurement model) and examines 
the strength of the relations between the constructs (i.e., structural model) (Chin 1998; Hulland 1999). We use PLS 
instead of covariance-based SEM techniques (such as LISREL) because some of the constructs included in the research 
model are measured using formative indicators, and using a covariance-based modeling technique to test the research 
model can result in an unidentified model (Kline 2006). PLS is also suitable for analyzing complex models with 
mediating constructs and second-order constructs (Chin and Newsted 1999). A number of recent accounting research 
studies use PLS for similar reasons (e.g., Hall 2008; Dowling 2009; Chapman and Kihn 2009). 
accounted for by the factor (Hair et al. 1998).21 However, we use a less strict rule of thumb for 
newly introduced measurement items (Chin 1998; Hulland 1999). To test formative construct 
validity we assess the indicators' weight rather than the loadings. 
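
The link between the 0.70 cutoff and the 50 percent figure follows from squaring a standardized loading, since 0.70 squared is 0.49. The short sketch below illustrates this; the function names are hypothetical and the two example loadings are chosen only for illustration.

```python
def explained_variance(loading):
    """Share of an item's variance accounted for by its factor."""
    return loading ** 2

def keep_reflective_item(loading, threshold=0.70):
    """Apply the rule of thumb: retain items loading at or above the threshold."""
    return loading >= threshold

# A loading just under 0.70 implies the factor explains under half the variance.
share_at_069 = explained_variance(0.69)   # about 0.476
share_at_080 = explained_variance(0.80)   # about 0.64
```

Under the strict rule an item loading at 0.69 would be dropped, which is why the less strict rule of thumb for newly introduced items matters in practice.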


Construct Validity 


Convergent Validity 


Item loading, together with the average variance extracted (AVE), captures the convergent 
validity of each of the measures for constructs that are modeled reflectively (Van den Bosch 1999). 
Table 1 shows all reflective measurement items have high and significant loadings, and all the 
weights of items in formative constructs are statistically significant, indicating their significant 
contribution to the measured construct. The AVE for all constructs exceeds 0.50 (ranging between 
0.52 and 0.86), supporting the convergent validity of the measurement items (Fornell and Larcker 
1981). 
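
For readers who want to see how these statistics relate to item loadings, the sketch below computes AVE and composite reliability with the standard formulas associated with Fornell and Larcker (1981). It assumes standardized loadings and uses the Business Operation values printed in Table 3 as inputs; those printed values are rounded item-construct correlations, so the output is an approximation rather than a reproduction of the PLS output.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Uses (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each item's error variance is taken as 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

# Business Operation item values as printed in Table 3.
business_operation = [0.83, 0.74, 0.69, 0.76, 0.83]
ave = average_variance_extracted(business_operation)
cr = composite_reliability(business_operation)
```

With these inputs the results round to 0.60 for AVE and 0.88 for composite reliability, in line with the values Table 3 reports for this construct.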


Discriminant Validity 


Table 2 shows that the values of the square roots of the AVE are all greater than the inter- 
construct correlations. This indicates that all measures have appropriate discriminant validity. An 
additional test of discriminant validity assesses each measurement item to ensure that it has a 
higher loading on its assigned factor than on the other factors (Chin 1998; Gefen et al. 2000) (see 
Table 3). Each measurement item loads higher on the appropriate construct than on any other 
construct (Chin 1998; Gefen et al. 2000), providing additional support as to the discriminant 
validity of the measures. 
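
The item-level test can be expressed as a simple check that each item's highest value falls on its assigned construct. The sketch below applies it to four rows taken from Table 3, limited to the five reflective constructs whose values are legible there; the function and variable names are hypothetical.

```python
CONSTRUCTS = ["Customer Relation", "Business Operation", "Marketing and Sales",
              "Shared Knowledge (Operational)", "TMT-Knowledge"]

# item -> (assigned construct, values across CONSTRUCTS), copied from Table 3.
ROWS = {
    "ASA2": ("Customer Relation", [0.77, 0.56, 0.38, 0.12, 0.17]),
    "AST4": ("Business Operation", [0.58, 0.83, 0.36, 0.18, 0.14]),
    "AST8": ("Marketing and Sales", [0.55, 0.41, 0.80, 0.30, 0.22]),
    "BEK2": ("TMT-Knowledge", [0.15, 0.16, 0.15, 0.45, 0.93]),
}

def loads_highest_on_assigned(assigned, values):
    """True when the largest value belongs to the assigned construct."""
    best = CONSTRUCTS[values.index(max(values))]
    return best == assigned

results = {item: loads_highest_on_assigned(*row) for item, row in ROWS.items()}
```

All four items pass. Extending the dictionary to every row of Table 3 would reproduce the full item-level test.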


Hypotheses Testing 


We use Smart PLS 2.0 with bootstrapping as a resampling technique (1000 random samples) 
to estimate the structural model and the significance of the paths. We use path coefficients and the 
R² jointly to evaluate the model (Chin 1998). Table 4 presents the PLS analysis results. As shown 
in Table 4, of the three control variables included in the research model (revenue and number of 
employees representing firm size in different ways, and time since adoption), only time since 
adoption is significantly related to BI assimilation. This indicates that organizations with more BI 
experience are able to assimilate BI better than those with less BI experience. 
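
As an illustration of how bootstrapping yields a t-statistic for a path, the sketch below resamples cases with replacement 1,000 times and divides the original estimate by the standard deviation of the resampled estimates. It is not Smart PLS: a simple Pearson correlation stands in for a standardized path coefficient, and the data, seeds, and function names are all hypothetical.

```python
import random
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation, used here as a stand-in for a standardized path."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def bootstrap_t(xs, ys, n_resamples=1000, seed=0):
    """Return (estimate, t), with t = estimate / bootstrap standard error."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = pearson(xs, ys)
    replicates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        replicates.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    return estimate, estimate / stdev(replicates)

# Hypothetical data: y depends on x plus noise.
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [0.6 * xi + data_rng.gauss(0, 1) for xi in x]
path, t_stat = bootstrap_t(x, y)
```

The resulting t-statistic is compared with the usual critical values (1.96, 2.58, and 3.29) to judge significance.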

Overall, the results suggest the model has good predictability. The coefficients for all paths in 
the model, except one, are statistically significant at the 0.05 level. The results also indicate that the model explains 23 
percent of the variance in BI assimilation, 41 percent of the variance in operational-level absorptive capacity, and 14 percent of 
the variance in IT infrastructure sophistication. 

We test hypotheses within the structural equation model shown in Figure 2. Hypotheses that 
posit direct relations between constructs (H1, H2, H3, H4, and H5) were tested based on the 
magnitude and significance of path coefficients estimated using Smart PLS 2.0. We tested the 
hypothesis that posits a mediated (indirect) relation (H2a) by calculating the magnitude and sig- 


21 The measurement items of BI assimilation, shared knowledge at operational level, TMT's strategic knowledge, 
knowledge-creation modes (at both operational and TMT levels), and IT infrastructure sophistication are modeled 
reflectively, while the measurement items of intensity of effort (at both operational and TMT levels) are modeled 
formatively. The categorization of a construct as formative or reflective is not always clear-cut and is influenced by the 
researcher's judgment (Chwelos et al. 2001; Dowling 2009). We did additional tests similar to those reported in prior 
studies (Chwelos et al. 2001; Dowling 2009) whereby we examined different alternative measurement models, modeling 
all constructs in the model as formative, all the constructs as reflective, and various models with different constructs 
modeled as formative or reflective. All these tests show results similar to those reported in this paper as no paths gained 
or lost statistical significance or changed in sign. This suggests the results are not driven by how the constructs are 
modeled (i.e., formative or reflective). 
Elbashir, Collier, and Sutton 

TABLE 3 
Items Loading and Cross Loading^a 

Constructs Measured Using Reflective Items 

[Columns for Generic Infrastructure and Specialized Infrastructure: values missing] 

Items                  Customer   Business    Marketing   Shared Knowledge   TMT-
                       Relation   Operation   and Sales   (Operational)      Knowledge
ASA1                   0.77       0.45        0.47        0.18               0.17
ASA2                   0.77       0.56        0.38        0.12               0.17
AST3                   0.75       0.46        0.53        0.19               0.08
AST5                   0.80       0.44        0.51        0.16               0.15
AST6                   0.85       0.48        0.51        0.30               0.19
AST7                   0.73       0.43        0.58        0.22               0.27
ASA5                   0.49       0.83        0.30        0.12               0.17
ASA7                   0.34       0.74        0.29        0.15               0.12
AST1                   0.38       0.69        0.20        0.19               0.11
AST2                   0.48       0.76        0.39        0.21               0.14
AST4                   0.58       0.83        0.36        0.18               0.14
ASA3                   0.52       0.26        0.84        0.18               0.09
ASA4                   0.51       0.32        0.84        0.26               0.12
AST8                   0.55       0.41        0.80        0.30               0.22
ICT1                   0.05       0.18        0.05        0.11               0.03
ICT2                   0.08       0.21        0.04        0.05               0.06
ICT7                   0.06       0.02        0.07        0.10              -0.03
ICT8                   0.14       0.13        0.09        0.22               0.10
ICT9                   0.09       0.16        0.04        0.22               0.13
ICT10                  0.22       0.17        0.10        0.27               0.13
ICT3                   0.36       0.23        0.22        0.14               0.12
ICT4                   0.35       0.23        0.20        0.16               0.12
ICT5                   0.37       0.18        0.32        0.15               0.03
ICT6                   0.25       0.27        0.20        0.11               0.00
Shared1*2              0.21       0.18        0.30        0.89               0.46
Shared3*4              0.25       0.22        0.27        0.95               0.44
Shared5                0.25       0.21        0.26        0.94               0.43
BEK1                   0.15       0.13        0.15        0.42               0.89
BEK2                   0.15       0.16        0.15        0.45               0.93
BEK3                   0.29       0.19        0.17        0.41               0.86
SOCO                   0.35       0.39        0.23        0.42               0.33
EXTO                   0.38       0.41        0.33        0.34               0.35
COMO                   0.31       0.36        0.28        0.41               0.33
INTO                   0.23       0.27        0.28        0.52               0.42
SOCS                   0.24       0.26        0.22        0.36               0.35
EXTS                   0.20       0.20        0.21        0.53               0.48
COMS                   0.32       0.27        0.38        0.47               0.47
INTS                   0.17       0.16        0.17        0.50               0.58
FIRMSIZRE             -0.03       0.02        0.04       -0.20              -0.19
FIRMSIZEM             -0.08      -0.01       -0.05       -0.21              -0.18
YEAR                   0.08       0.15        0.17       -0.03               0.07
Composite Reliability  0.92       0.88        0.87        0.95               0.92
AVE                    0.61       0.60        0.64        0.86               0.82
Constructs Measured Using Formative Items or with One Item (Control Variables) 

[Only the leading values of these columns are recoverable; row assignments are not shown.] 

Intensity of Effort (Operational): 0.30, 0.22, 0.26, 0.26, 0.40, 0.35 
Intensity of Effort (TMT): 0.17 
Size (Revenue): -0.03, -0.06, 0.06, -0.03, -0.04 
Size (Employees): -0.06, -0.07, 0.03, -0.04, -0.12, -0.04 
Time Since Adoption: 0.09 

                        Intensity       Intensity                                 Time
                        of Effort       of Effort    Size        Size             Since
Items                   (Operational)   (TMT)        (Revenue)   (Employees)      Adoption
Composite Reliability   NA^b            NA^b         NA^c        NA^c             NA^c
AVE                     NA^b            NA^b         NA^c        NA^c             NA^c

a Cross-loading is obtained by calculating the correlation of the standardized latent variable scores with the standardized 
value of the item (Pavlou and Gefen 2005). 

b Composite reliability and AVE are suitable only when the construct is measured with reflective indicators. 

c Control variables that are measured with one item. 

TABLE 4 
Path Coefficients: Test and Control Variables 

                                                                              Path          t-statistic
Path                                                                          Coefficient   (z-score)
Hypotheses: Direct Effect
  Operational-level AbsCapacity → Assimilation of BI systems (H1)              0.33          6.37***
  TMT's AbsCapacity → Operational-level AbsCapacity (H2)                       0.64          18.47***
  IT Infrastructure Sophistication → Assimilation of BI (H3)                   0.21          3.47
  Operational-level absorptive capacity → IT Infrastructure                    0.40          6.87***
    Sophistication (H4)
  TMT's AbsCapacity → IT Infrastructure Sophistication (H5)                   -0.04          0.53
Hypotheses: Indirect Effect
  TMT's AbsCapacity → Operational-level AbsCapacity → BI                       0.21^a        (6.10***)
    assimilation (H2a)
  TMT's AbsCapacity → Operational-level AbsCapacity → IT                       0.25^a        (6.43***)
    Infrastructure Sophistication (ad hoc test for H5)
Control Variables
  Size (No. of employees) → BI assimilation                                   -0.09          1.61
  Size (Revenue) → BI assimilation                                             0.05          0.89
  Time since adoption of BI → BI assimilation                                  0.10          1.98*

* t-statistic > 1.96 is significant at p < 0.05 level. 
** t-statistic > 2.58 is significant at p < 0.01 level. 
*** t-statistic > 3.29 is significant at p < 0.001 level. 
a Standardized path. 


nificance of mediated paths (Hoyle and Kenny 1999; Subramani 2004). Additional tests examine 
whether the indirect path represents full or partial mediation (Baron and Kenny 1986; Mathieson 
et al. 2001). 

Hypothesis 1 predicts that increased levels of operational-level absorptive capacity will en- 
hance the levels of BI assimilation. The results shown in Figure 2 and Table 4 support the 
hypothesis with a strong and significant relationship (0.33, p < 0.001). This result supports the 
belief that operational and line managers' knowledge and knowledge-creation activities (i.e., ab- 
sorptive capacity) have a positive influence on assimilation. 

Hypothesis 2 predicts that increased levels of TMT absorptive capacity enhance the 
operational-level absorptive capacity. The results shown in Figure 2 and Table 4 again support the 
hypothesis, showing a strong and significant relationship (0.64, p < 0.001). This finding 
provides support for the integrative effects of cultural controls that primarily emanate from TMT- 
level management and the subsequent impact on organizational learning and knowledge creation 
at the operational and line levels. 

Hypothesis 2a elaborates on this relationship to examine the connection that occurs between 
the outcome of cultural controls and the core MCS controls provided by BI systems (e.g., planning 
and cybernetic controls). Hypothesis 2a examines supporting components of this relationship by 
looking specifically at the relationship between TMT absorptive capacity and BI assimilation with 
the theorization that this relationship will occur through operational-level absorptive capacity. 
Using the Hoyle and Kenny (1999) approach to test the mediated relationship, we first look at the 
indirect effect, which is strong and significant (0.21, p < 0.001) (see Table 4). We subsequently test 
whether the mediation is partial or full (Baron and Kenny 1986). Results, as shown in Table 5, 
confirm full mediation. This provides evidence of the link between cultural controls and other core 
MCS controls provided by BI, but also reflects the importance of line and operational managers 
achieving this interrelationship and successful integration of MCS as a package. 
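
The size of the indirect effect can be checked against the direct paths in Table 4, because an indirect effect is the product of the two path coefficients it passes through. The sketch below also computes a Sobel-type z statistic as one common way to test such a product. The Sobel formula is an assumption here, since the paper cites Hoyle and Kenny (1999) without stating the formula it used, and the standard errors are recovered from the reported values as coefficient divided by t-statistic.

```python
from math import sqrt

def indirect_effect(a, b):
    """Indirect effect as the product of the two path coefficients."""
    return a * b

def sobel_z(a, t_a, b, t_b):
    """Sobel-type z for the indirect effect a*b.

    Standard errors are recovered as coefficient / t-statistic.
    """
    se_a, se_b = a / t_a, b / t_b
    return (a * b) / sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

# Path coefficients and t-statistics reported in Table 4 for H2 and H1.
a, t_a = 0.64, 18.47   # TMT absorptive capacity -> operational-level absorptive capacity
b, t_b = 0.33, 6.37    # operational-level absorptive capacity -> BI assimilation
effect = indirect_effect(a, b)
z = sobel_z(a, t_a, b, t_b)
```

The product rounds to the reported 0.21. The Sobel-type z comes out at roughly 6.0, close to but not identical with the reported 6.10, which is consistent with rounding in the published inputs or with a slightly different formula.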

Hypothesis 3 predicts that IT infrastructure sophistication will enhance BI assimilation. The 
results shown in Figure 2 and Table 4 support the hypothesis (0.21, p < 0.05). This supports the 
belief that underlying enterprise systems-related IT infrastructure plays an important role in BI 
assimilation. 

Hypothesis 4 predicts increased levels of operational-level absorptive capacity will enhance 
levels of IT infrastructure sophistication. The results in Figure 2 and Table 4 show a strong and 
significant relationship (0.40, p < 0.001) supporting the theorized relationship. This finding high- 
lights the key role of operational and line managers in driving development of appropriate IT 
infrastructure. 

Hypothesis 5 predicts that increased levels of TMT absorptive capacity will enhance levels 
of IT infrastructure sophistication. It is the only hypothesis not supported (-0.04, t = 0.53; see Table 4). This finding is 
surprising, given that several earlier MCS studies report a strong relationship between TMT 
involvement and successful implementation of specific MCS components. We conduct ad hoc tests 
to determine if this relationship is offset by operational-level absorptive capacity, which we found 
to be strongly linked to IT infrastructure sophistication. The most likely scenario from a theoretical 
perspective would be that TMT absorptive capacity influences IT infrastructure sophistication 
indirectly through operational-level absorptive capacity. Tests for this mediation effect, using the 
same techniques for testing H2a, are significant, and reflect full mediation (see Tables 4 and 5). 
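
The nested model comparison in Table 5 can be reproduced approximately from its R² values. The sketch below assumes the conventional effect-size definition f² = (R² with direct path - R² nested) / (1 - R² with direct path) and the pseudo F given in the table header. The sample size n = 347 comes from the final sample of organizational units; the values of k (number of predictors) are assumptions chosen for illustration, since the paper does not state them here.

```python
def effect_size_f2(r2_nested, r2_full):
    """Effect size of adding the direct path to the nested model."""
    return (r2_full - r2_nested) / (1 - r2_full)

def pseudo_f(f2, n, k):
    """Pseudo F = f2 * (n - k - 1)."""
    return f2 * (n - k - 1)

N_UNITS = 347  # final sample of organizational units

# R-squared values from Table 5.
f2_h2a = effect_size_f2(0.228, 0.231)   # TMT absorptive capacity -> BI assimilation
f2_h5 = effect_size_f2(0.138, 0.139)    # TMT absorptive capacity -> IT infrastructure

# k values are assumptions, not taken from the paper.
pseudo_f_h2a = pseudo_f(f2_h2a, N_UNITS, k=5)
pseudo_f_h5 = pseudo_f(f2_h5, N_UNITS, k=2)
```

Under these assumptions the effect sizes round to 0.004 and 0.001 and the pseudo F values to 1.33 and 0.40, matching Table 5. Both are small enough that adding the direct path does not significantly improve the model, which is what a conclusion of full mediation requires.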

While this latter finding is inconsistent with our a priori theoretical model, the finding is not 


TABLE 5 
Nested Model Comparison 

                                   R² in Nested    R² in Models   f² (The Magnitude
                                   Model (No       with Direct    of the Change       Pseudo F =
Direct Paths                       Direct Path)    Path           in R²)              f² × (n - k - 1)   Conclusion
TMT's absorptive capacity to       0.228           0.231          0.004               1.33               Not significant
  BI assimilation: An
  additional test for (H2a)
TMT's absorptive capacity to       0.138           0.139          0.001               0.40               Not significant
  IT infrastructure: An
  additional test for (H5)
necessarily inconsistent with an absorptive-capacity view of assimilation. Prior studies focus on 
TMT involvement and specific MCS implementation, whereas we focus on absorptive capacity 
and broad-based MCS implementation. Our findings could be interpreted as indicating the critical 
importance that operational and line managers play in the integration of technology with business 
processes and the importance at the user level of recognizing where business value can be derived. 
This finding should be investigated further in future studies. 


V. DISCUSSION AND CONCLUSION 

Recent research indicates that MCS are evolving to provide a more strategic view of value- 
chain activities including product development, sales and marketing, customer relations, and per- 
formance management (Davila and Foster 2005, 2007). In just the last few years, we have seen a 
shift in the fundamental role of the management accountant to consistently include internal analy- 
sis and risk management activities (DeLoo et al. 2009). This is reflected in the Institute of Man- 
agement Accountants’ (IMA) December 2008 release of a Statement on Management Accounting 
that changed the definition of a management accountant for the first time in 30 years (IMA 2008, 
1): 

Management accounting is a profession that involves partnering in management decision making, 
devising planning and performance management systems, and providing expertise in financial 
reporting and control to assist management in the formulation and implementation of an 
organization's strategy. 


In this study, we examine the assimilation of BI systems that automate MCS as a package and 
support an ever-increasing number of these contemporary responsibilities for management accoun-
tants. We examine BI systems provided by one vendor whose product is considered a leader in
providing MCS capability. We focus our study on the cultural controls associated with knowledge 
creation (i.e., absorptive capacity). The resulting organizational absorptive capacity impacts suc- 
cessful assimilation of BI systems. 

Our results provide evidence that increased absorptive capacity among operational-level man- 
agers is strongly related to both increased levels of BI assimilation and increased levels of sophis- 
tication in the underlying technology infrastructure that enables BI systems. At the same time, 
operational-level absorptive capacity is strongly tied to TMT's absorptive capacity, providing 
evidence of the importance of cultural controls related to knowledge management and resource 
development. These results have implications not only for the role of cultural controls in fostering 
an effective knowledge culture, but also for the view among many researchers that MCS should be 
considered as a package—i.e., recognizing that specific MCS components are impacted by the 
surrounding MCS controls as a whole (Chenhall 2003; Malmi and Brown 2008). 

The results for BI assimilation provide some evidence of a potential diffusion effect on the 
management accounting function as the scope of MCS broadens. Several researchers have ob- 
served indications of this diffusion effect due to the adoption of enterprise systems and associated 
increased capability for MCS (see Granlund and Malmi 2002; Caglio 2003; Scapens and Jazayeri 
2003; Quattrone and Hopper 2005). Our results indicate that organizations with greater absorptive 
capacity assimilate BI across business operations, customer relations, and marketing and sales—a 
phenomenon that could be perceived as showing widespread change consistent with these obser- 
vations in earlier studies of specific organizations. Future research should explore this diffusion 
phenomenon in order to better understand the impact on management accountants’ roles in orga- 
nizations and the MCS function overall. 

Our research also contributes to understanding the roles of TMT in MCS innovations through 
an expanded view of organizational absorptive capacity. In particular, we shift the focus of ab- 
sorptive capacity to a dynamic perspective of knowledge and incorporate a measurement of in- 
tensity of effort that has not been considered in prior research on the link between TMT involve-
ment and MCS innovation success. Focusing on this dynamic view of absorptive capacity, we
incorporate the knowledge-creation capabilities that are critical to adapting to new information and 
new technologies. Incorporating this broader measure of absorptive capacity, we are able to pro- 
vide evidence that both operational-level and TMT absorptive capacity are likely to influence the 
assimilation of a broad range of strategic systems related to MCS. At the same time, we also show 
evidence that the absorptive capacities of the operational-level managers are fundamental to es- 
tablishing the robust IT infrastructure necessary to support strategic systems that support MCS 
innovations. Operational-level managers' absorptive capacity is similarly key to organizational 
assimilation of strategic technologies to gain strategic advantage. Prior research has failed to 
establish the link between TMT's knowledge and assimilation of strategic systems (e.g., Arm- 
strong and Sambamurthy 1999). Our results confirm the TMT effect as indirect and flowing 
through the operational level. Future research should further explore this indirect effect in order to 
better understand the mechanisms through which TMT's knowledge can be leveraged through
cultural controls. Further, we have focused on one particular aspect of cultural controls, absorptive 
capacity. Future research should also consider the broader range of cultural controls and the effect 
on successful assimilation of MCS. 

The findings related to absorptive capacity effects on IT infrastructure are also of interest. 
There is a contrast between operational-level absorptive capacity, which has a significant influence 
on IT infrastructure sophistication, and TMT's absorptive capacity, which does not have a direct
effect. Post hoc tests reveal an indirect relationship between TMT's absorptive capacity and IT 
infrastructure with operational-level managers' absorptive capacity fully mediating the relation- 
ship. These findings indicate that sophisticated IT infrastructure is developed through socially 
complex processes that involve collaborations between TMT, IT specialists, and operational-level 
managers. The cultural controls put in place by TMT have a substantial effect on these collabo-
rations. Moreover, the results also support the calls made in both the accounting and information
systems literature for management accountants, line managers, and IT managers to develop 
broader overlapping knowledge, interpersonal skills, and deeper understanding of the strategic 
content of the organization (Todd et al. 1995; Nelson and Cooprider 1996; Caglio 2003; Byrd et al. 
2004; Dechow and Mouritsen 2005; Quattrone and Hopper 2005; Arnold 2006; Rikhardsson and 
Kraemmergaard 2006). 

In weighing the results of this study, several inherent limitations should be considered. First, 
although an attempt was made to solicit multiple respondents from each of the targeted firms, only 
a single contact was available for many of the firms. However, tests show no differences between 
mean responses received from single respondents versus those from multiple respondents. Second, 
using the same informants to answer questions on both operational- and strategic-level constructs 
may result in a respondent's bias or knowledge bias. However, techniques used in the study help 
to alleviate some of that concern. These include providing respondents with a "No Basis for 
Answering" option in the survey, capturing the same data from multiple respondents, testing for 
consistency between multiple respondents, testing for common method variance, and testing for 
discriminant validity. All of these tests suggest that such concern does not threaten the validity of 
the results. Future studies could consider using two separate surveys. One would be answered by 
TMT members and include questions on the TMT characteristics, and the other by middle-/
operational-level managers and include questions on BI assimilation, IT infrastructure, etc. Third,
the study investigates TMT knowledge and argues that it enables the cultural controls that are 
necessary for building organizational absorptive capacity, appropriate IT resources, and supporting 
BI assimilation. Future research could examine the direct effects of cultural controls on the 
cybernetic and planning controls that are most enabled by BI systems. 

This research has significant practical implications for management accountants, organiza-
tional management, managers using MCS, IT consultants, and technology vendors. The clearest
message is that successful provision of MCS capability is not achieved by acquiring "state-of-the-
art" software, but arises through developing appropriate organizational capability for generating
and using new knowledge to support MCS. This reasoning also emphasizes that technology
innovation is driven from the bottom-up by operational managers, as opposed to the dominant
view of top-down, top-management-driven innovation. The top-down view is traditionally per-
ceived as the driving force behind MCS capability. The findings from this study also reinforce
organizations' continuous effort in knowledge-creation activities at both TMT and operational
levels. Organizations should promote knowledge-creation activities as an ongoing process in pref-
erence to purposeful and directed short-term knowledge acquisition such as training programs.
ABSTRACT: This study examines disaggregated management forecasts as a mecha-
nism to reduce investors' fixation on announced earnings. Our experimental results
suggest that investors' earnings fixation is reduced when they initially observe a disag- 
gregated management forecast (earnings and its components) versus when they ob- 
serve an aggregated forecast (earnings only). We also provide theory-consistent evi- 
dence that this reduction in earnings fixation is associated with investors interpreting 
the summary net income figure as one of several similarly important evaluation inputs 
rather than a substantially more important input (relative to its components). Finally, we 
provide evidence that suggests our results are not bounded by the level of emphasis on 
net income in the subsequent earnings announcement, and not fully explained by three 
plausible alternative explanations. Our study extends the voluntary disclosure literature 
by providing evidence that the form of management disclosures can influence investors' 
interpretation of subsequently announced information, and contributes to practice by 
providing a potential alternative to stopping earnings guidance. 
tendency of investors to focus on companies' short-term results in lieu of long-term potential.
Often this short-termism is attributed to investors’ excessive reliance on accounting earnings as the 
primary input to investment judgments, without fully considering other information that may be 
relevant to evaluating the company's investment prospects (Graham et al. 2005; CFA Centre for 
Financial Market Integrity 2006). Further, this excessive reliance on accounting earnings (i.e.,
earnings fixation) is often cited as the key source of managers' myopic behavior and associated
reduction in long-term value delivered to shareholders (CFA Centre for Financial Market Integrity
2006, 2007).¹

Although earnings fixation represents an institutional phenomenon that can be undesirable to 
managers and investors, prior research has primarily focused on documenting the existence and 
extent of earnings fixation rather than on mechanisms that reduce this fixation. We seek to fill this 
gap by examining a mechanism to reduce earnings fixation. Specifically, we examine whether and 
how the level of aggregation of a management forecast can reduce the extent to which investors'
investment judgments reflect earnings fixation. 

We focus on management forecasts because (1) they are often investors' first exposure to a
firm's expectation of its future bottom-line summary earnings, and (2) management has full dis- 
cretion over the format of the forecast—specifically, the items for which a forecast is provided. 
Furthermore, several stakeholder groups have advocated the potentially costly approach of elimi- 
nating management forecasts to reduce fixation (Chen et al. forthcoming; Houston et al. 2010). We 
examine a potential alternative to eliminating guidance as a mechanism to reduce earnings fixa- 
tion: simply altering the level of aggregation in the guidance provided. 

We predict that investors' susceptibility to earnings fixation will decrease (increase) when 
investors initially observe a management forecast that disaggregates (aggregates) the components 
of forecasted earnings. Our prediction is guided by research in psychology and marketing that 
suggests that because information can often be perceived in several different ways, the interpre- 
tation of that information depends on the particular concept or knowledge structure that is cur-
rently active (Higgins and King 1981; Wyer and Srull 1981; Yi 1990a, 1990b).

More specifically, we contend that investors possess at least two general concepts related to 
net income—one conceptualization of net income as one of several important inputs to evaluating 
a company's prospects, and another as a sufficient and convenient summary measure, and thus a 
substantially more important measure, of a company's prospects (relative to its components). We 
predict that when investors observe a disaggregated earnings forecast, which places net income 
among a list of other potentially relevant inputs, the former conceptualization will likely be 
primed. In contrast, when investors observe an aggregated earnings forecast, where net income is 
isolated, the latter conceptualization of net income will likely be primed. Once the investor ob- 
serves net income in the subsequent earnings announcement, the primed knowledge structure will 
be activated and used to evaluate net income. Thus, the investment judgments of investors who 
initially observe a disaggregated (an aggregated) earnings forecast are expected to display less 
(more) earnings fixation because they evaluate net income as one of several similarly important 
inputs (as a substantially more important input) in assessing the company's prospects. 

We use an experiment to test these predictions. Participant-investors start by observing a
management earnings forecast and making initial investment judgments. We manipulate the fore- 
cast such that it is disaggregated or aggregated. Consistent with Hirst et al. (2007), disaggregated 
earnings forecasts are defined as forecasts of the bottom-line net income number as well as 


1 The CFA Centre for Financial Market Integrity's (2006) report titled "Breaking the Short-Term Cycle" defines short-
termism as the "excessive focus of some corporate leaders, investors, and analysts on short-term, quarterly earnings and
a lack of attention to the strategy, fundamentals, and conventional approaches to long-term value creation."
forecasts of each line item on the income statement (e.g., revenues, research and development
expenses, etc.). Aggregated earnings forecasts are defined as forecasts of the bottom-line net 
income number alone. Participants then observe an earnings announcement and make a second set 
of investment judgments. Although the earnings announcement always reports a net income figure 
that matches forecasted net income, we manipulate whether the underlying components of net 
income indicate more favorable or less favorable performance. This research design allows us to 
infer earnings fixation when participants’ investment judgments are relatively the same regardless 
of the favorability indicated by the net income components, as such judgments merely reflect the
fact that the announced net income figure is identical across favorability conditions. In contrast, 
fixation is taken to be reduced when participants' judgments are more (less) favorable when the 
reported net income components indicate more (less) favorable performance. 
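
The inference rule described above can be illustrated with a short sketch. It is not the authors' analysis: the ratings below are made-up numbers, and the cell labels and function name are hypothetical. The sketch only shows the logic that, within each forecast condition, the difference between mean judgments under more favorable and less favorable components indicates how far judgments look past the identical net income figure.

```python
from statistics import mean


def component_sensitivity(judgments):
    """Mean judgment under more favorable components minus mean judgment under
    less favorable components, computed separately for each forecast condition.

    judgments maps (forecast_condition, favorability) to a list of ratings.
    A difference near zero is read as earnings fixation, because net income
    is identical across the two favorability conditions.
    """
    result = {}
    for forecast in ("aggregated", "disaggregated"):
        more = mean(judgments[(forecast, "more_favorable")])
        less = mean(judgments[(forecast, "less_favorable")])
        result[forecast] = more - less
    return result


# Hypothetical ratings, for illustration only.
example = {
    ("aggregated", "more_favorable"): [5, 6, 5],
    ("aggregated", "less_favorable"): [5, 5, 6],
    ("disaggregated", "more_favorable"): [7, 8, 7],
    ("disaggregated", "less_favorable"): [4, 3, 4],
}
sensitivity = component_sensitivity(example)
```

In this made-up data the aggregated condition shows no sensitivity to the components, while the disaggregated condition shows a clear difference, which is the pattern the hypothesis predicts.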

As predicted, participants' investment judgments reflect less earnings fixation when they 
initially observe a disaggregated management earnings forecast. Specifically, participants who 
initially observe a disaggregated forecast provide investment judgments that are more (less) fa-
vorable when the net income components are more (less) favorable. In contrast, participants who
initially observe an aggregated forecast provide investment judgments that are insensitive to the 
reported levels of the net income components. This evidence suggests that the level of aggregation 
in the initial management earnings forecast influences the extent to which participants’ investment 
judgments reflect line items other than bottom-line net income (i.e., the extent of earnings fixa- 
tion). 

We also find evidence consistent with our theory that this reduction in earnings fixation is 
associated with investors' interpreting the summary net income figure as one of several similarly 
relevant evaluation inputs rather than a substantially more important input (relative to its compo- 
nents). We ask participants to allocate 100 points among the bottom-line net income number and 
its components to indicate each item's importance in evaluating the company as a potential in- 
vestment. Participants who observe a disaggregated forecast assign less (more) importance to the 
bottom-line net income number (net income components) than those who observe an aggregated 
forecast. This result supports our contention that the disaggregated forecast leads investors to 
consider net income as one of several similarly important inputs in evaluating the company as a 
potential investment. 
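
The 100-point allocation measure can be sketched as follows. This is a hedged illustration, not the study's instrument: the item names and the function name are hypothetical, and the only rule taken from the text is that each participant distributes exactly 100 points between bottom-line net income and its components.

```python
def net_income_share(allocation, bottom_line="net_income"):
    """Return (points on net income, points on all components combined).

    allocation maps each item to the points one participant assigned;
    the points must total 100, as in the task described above.
    """
    total = sum(allocation.values())
    if total != 100:
        raise ValueError(f"allocation must total 100 points, got {total}")
    net_income_points = allocation[bottom_line]
    return net_income_points, total - net_income_points


# Hypothetical participant response, for illustration only.
points = net_income_share({"net_income": 40, "revenue": 30, "rd_expense": 30})
```

Averaging the first element across participants in each forecast condition would give the comparison reported in the paragraph above.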

In additional analyses, we find evidence that our results cannot be fully explained by (1)
differences in available information across forecast aggregation levels, (2) management signaling 
precisely which financial statement items are relevant by disaggregating the forecast, or (3) fore- 
cast aggregation levels affecting investors’ perceived credibility of the forecast. We also find that 
the effect of a disaggregated forecast is nearly identical regardless of whether the subsequent 
earnings announcement emphasizes or deemphasizes the bottom-line net income number. Thus, 
the effect of disaggregating the earnings forecast does not appear to be bounded by the level of 
emphasis of the bottom-line net income number in the subsequent earnings announcement. 

This study makes several contributions to the extant literature. First, we extend accounting 
research that documents the existence of investors' fixation on earnings (e.g., Bushee 1998) by 
proposing a potentially attractive approach to reduce fixation in investors' judgments. Second, we 
extend the disclosure literature in accounting by providing evidence that benchmarks, or forecasts, 
can not only be used to evaluate current and prior period information (Schrand and Walther 2000;
Krische 2005), but can also affect how investors interpret subsequently reported accounting infor- 
mation. Specifically, we show that a disaggregated earnings forecast can lead investors to interpret 
subsequently announced net income to be one of several similarly important inputs in evaluating 
a company's future prospects. Third, we extend the disclosure literature on the effects of forecast 
characteristics on investors' responses to subsequently announced earnings (Libby and Tan 1999; 
Tan et al. 2002; Libby et al. 2006; Han and Tan 2007) by examining the influence of a different
characteristic—the level of forecast aggregation—on investors’ response to subsequently an- 
nounced earnings. 

This study also extends recent research on the influence of the level of forecast aggregation on 
investor judgments. Hirst et al. (2007), for example, report results suggesting that investors judge 
disaggregated forecasts as more credible than aggregated forecasts and deem subsequently re- 
ported net income to be of higher quality. One implication is that investors should rely on subse- 
quently reported earnings more when they initially observe a disaggregated forecast. By showing 
that the level of aggregation influences not only investors’ judgments of forecast credibility, but 
also the extent to which investors' judgments reflect earnings fixation, we refine our understanding 
of the relationship between forecast aggregation level and investors' reactions to subsequently 
reported earnings. 

Finally, this study contributes to practice. In particular, we provide evidence that modifying 
the format of management earnings guidance may reduce investors' susceptibility to earnings 
fixation, while avoiding the potentially costly effects associated with eliminating guidance. This 
evidence should be important to both managers, who have incentives to reduce pressures for
myopic behavior, and regulators, who have expressed concerns about the effects of investors'
earnings fixation. 

Section II discusses relevant background information and develops our hypothesis. Section III 
describes our experiment. Section IV presents our main results along with evidence of our theory 
and describes an alternative theoretical explanation. Section V presents supplemental analyses. 
Section VI concludes. 


II. BACKGROUND AND HYPOTHESIS DEVELOPMENT

Company managers, financial market participants, regulators, and other stakeholders have 
each raised concerns about the tendency of investors to fixate on earnings (see Bushee 2001; 
Graham et al. 2005; CFA Centre for Financial Market Integrity 2007). Earnings fixation refers to 
investors' excessive reliance on a company's accounting earnings in evaluating the firm, without
fully considering other information that is relevant to evaluating the company's investment pros-
pects (see Graham et al. 2005).² Although their effect on the U.S. economy is not easily quantified,
investors' earnings fixation and related short-termism are commonly depicted as detrimental to the
judgments and decisions of both managers and investors (Bushee 1998, 2001; Graham et al.
2005).

We argue that earnings fixation occurs when investors conceptualize net income as a substan- 
tially more important input (relative to its components) in evaluating a company's prospects, rather 
than as one of several similarly important inputs. Research in psychology suggests that investors’ 
initial conceptualization of a net income figure is likely to persist when subsequently exposed to 
the net income figure (Fischhoff 1977). Although announced earnings are undoubtedly important 
to investors, earnings forecasts often represent investors' initial exposure to a net income amount 
(The Economist 2006; Houston et al. 2010). Thus, our study focuses on earnings forecasts as an
opportunity to affect investors' initial conceptualization of net income. 

Research in accounting identifies several reasons for companies to issue forecasts. For ex- 
ample, management forecasts help to calibrate analysts' forecasts to minimize analyst forecast 
errors and forecast dispersion (e.g., Ajinkya and Gift 1984; Houston et al. 2010) and guide 


2 Our definition of earnings fixation is consistent with the idea of fixating on earnings without regard to how bottom-line
net income is derived (Murray 1991).

3 See Graham et al. (2005) for a discussion of managers' myopic decision-making in response to a perceived investor
fixation on earnings.
analysts' forecasts toward beatable earnings targets (e.g., Cotter et al. 2006). Management guid-
ance can also reduce litigation risk associated with not warning investors of impending bad news
(e.g., Skinner 1994, 1997; Johnson et al. 2001). In addition, issuing forecasts can aid in building 
a reputation for transparent reporting (e.g., Graham et al. 2005; Hutton and Stocken 2009). 

Recently, several stakeholders have advocated the elimination of earnings forecasts, including 
the CFA Institute and the Business Roundtable, the National Investor Relations Institute, the 
Aspen Institute's Corporate Values Strategy Group, the Committee for Economic Development, 
the U.S. Chamber of Commerce, members of Congress, and members of the Securities and 
Exchange Commission (Janjigian and Ozanian 2006; Houston et al. 2010). Also, several compa- 
nies have opted to eliminate earnings guidance (e.g., McDonald's Corp. and Merrill Lynch; see 
also Chen et al. forthcoming; Cheng et al. 2007; Houston et al. 2010). These parties argue that 
earnings forecasts lead investors to fixate on earnings, which, in turn, directs managers to engage in
myopic behavior that often runs counter to maximizing the company's long-term growth and 
shareholder value (Graham et al. 2005; Houston et al. 2010). 

Eliminating earnings forecasts could be costly to the company, however. For example, elimi- 
nating forecasts may not be a realistic option for smaller companies that are less closely followed 
by investors (McCafferty 2007). In addition, research in this area finds a negative market reaction 
and a deterioration of the information environment for firms that stop providing quarterly guidance 
(Chen et al. forthcoming; Houston et al. 2010). Specifically, Houston et al. (2010) provide evi- 
dence of a decrease in analyst coverage of "guidance stoppers," and Bowen et al. (2008) suggest 
that a decrease in analyst coverage leads to an increase in the cost of raising equity capital. 

We propose that, rather than eliminate forecasts, managers might consider changing the type 
of forecasts they issue. Managers provide forecasts with different levels of aggregation. For 
example, a recent examination of S&P 500 companies indicates that of the 63 percent of firms that 
give earnings guidance, 23 percent provide highly aggregated forecasts that forecast earnings 
alone, while 22 percent provide highly disaggregated forecasts that forecast several components of 
earnings (see Lansford et al. [2010] and forecast examples at Exhibit 1). This research suggests 
that managers’ choice of whether to issue aggregated or disaggregated forecasts is a function of 
their desire to signal their ability to forecast, enhance the credibility of earnings forecasts, and 
respond to external demand for additional information. We argue, however, that the level of 
management's forecast aggregation can also affect the extent of investors' earnings fixation.

We predict that a management forecast that is more disaggregated will reduce earnings fixa- 
tion. Prior research in psychology and marketing suggests that because information can often be 
perceived in several different ways, the interpretation of that information depends on the particular 
concept or knowledge structure that is currently active (Higgins and King 1981; Wyer and Srull 
1981; Yi 1990a, 1990b). More specifically, because individuals do not undertake an exhaustive 
memory search for all potential useful concepts, the active or accessible concepts are more likely 
to be used to process subsequent information (Wyer and Srull 1981; see also Tversky and Kah- 
neman 1973, 1974; Taylor and Fiske 1979). 

More closely related to our study, Yi (1990a, 1990b) examined the way an advertising context 


4 We acknowledge that there can also be costs to issuing a disaggregated forecast. For example, providing detailed
guidance of a firm's operations may assist competitors. In addition, for firms with complex operations, it may be more 
difficult for its managers to predict the firm's earnings components (see Lansford et al. 2010 for a discussion). Thus, any 
given manager must weigh the costs associated with stopping guidance against the costs (benefits) of issuing disaggre- 
gated guidance. 
EXHIBIT 1
Examples of Management Forecasts


Panel A: Aggregated Management Forecastᵃ

Xerox expects first-quarter 2006 earnings in the range of 20-23 cents per share. The company also
reiterated its full-year 2006 guidance of $1.00-$1.07 per share. Mulcahy indicated that she now
expects the company will deliver full-year earnings in the high end of this range.


Panel B: Disaggregated Management Forecastᵇ

Intel 2006 Outlook

Revenue: Expected to be between $9.1 billion and $9.7 billion.
Gross margin: 59 percent, plus or minus a couple of points (excluding share-based compensation
effects of approximately 1 percent).
Expenses (R&D plus MG&A): Approximately $3.3 billion (approximately $3 billion excluding
share-based compensation effects of approximately $300 million).
Gains from equity investments and interest and other: Approximately $140 million.
Tax rate: Approximately 32 percent.
Depreciation: $1.1 billion, plus or minus $100 million.
Amortization of acquisition-related intangibles and costs: Approximately $20 million.


a Xerox released an aggregated management forecast, which provided a forecast of earnings only.
b Intel released a disaggregated management forecast, which provided a forecast of earnings components.
could affect the processing of product information in print ads.⁵ Yi (1990a, 1990b) finds that prior
exposure to contextual factors can prime certain product attributes and, in turn, increase the
likelihood that consumers interpret product information in terms of these activated attributes.
Specifically, Yi (1990a) conducts an experiment in which consumers were given an automobile ad
that highlights the car's "large size." Noting that consumers likely have knowledge structures of a
car's "large size" that depict the attribute either as a favorable input for car safety or an unfavor-
able input for fuel efficiency, Yi (1990a) argues that the context surrounding the ad can affect
which concept or knowledge structure is activated and used. Specifically, the author predicts that
placing the ad within an article about safety primes the "car safety" knowledge structure, making
it more likely that consumers interpret the car's large size as a favorable input for car safety. Yi's
(1990a) results support his prediction. Consumers provided more favorable judgments about the
advertised car when it was located within the car safety article than when it was not.

We contend that investors have at least two general concepts or knowledge structures related 
to net income. Prior research in accounting suggests that investors sometimes conceptualize net 
income as one of several important inputs to evaluating a company's prospects (see Elliott et al. 
2008), likely obtained from formal education (e.g., see Penman 2007), investing guides, or expe- 
rience. However, many have suggested that investors often forgo this net income conceptualization 
and instead conceptualize net income as a sufficient and convenient summary measure of a com-
pany's prospects (e.g., see CFA Centre for Financial Market Integrity 2007).⁶ The latter concep-
tualization leads investors to fixate on a bottom-line net income number in subsequent earnings
announcements. 

We argue that an aggregated forecast isolates net income, priming the conceptualization of net 
income as a convenient summary measure and, thus, as a substantially more important measure of 
company prospects relative to its components. By contrast, a disaggregated forecast places net 
income among a list of other potential inputs in judging the company's prospects. In this case, 
similar to consumers in Yi's (1990a) study, we propose that a disaggregated forecast places net
income in a context that primes or makes salient investors' knowledge structures of net income as
one of several similarly relevant inputs in evaluating the company's prospects. Once the investor
observes net income in the subsequent earnings announcement, the primed knowledge structure of
net income as one of several similarly relevant inputs is activated.⁷ Thus, investors' evaluation of
net income in assessing the company's prospects is made in terms of the activated knowledge 
structure. As such, we expect the investment judgments of investors who initially observe a 
disaggregated (an aggregated) earnings forecast to display less (more) earnings fixation because 
they evaluate net income as one of several similarly relevant inputs (as a substantially more 


5 Yi (1990a) contemplates both cognitive and affective priming effects on attribute interpretation. Our interest in the effect
of forecast display suggests a cognitive priming effect would be more applicable to our study. While Yi (1990a)
documents that both cognitive and affective primes can influence attribute interpretation, the author finds that cognitive 
primes influence attribute interpretation independent of affective primes and that the two types of priming do not 
interact. Thus, considering affective priming is beyond the scope of our study. 

6 We note that net income can sometimes be a sufficient measure of a company's prospects. Thus, treating net income as
a summary measure of a company's prospects may also result from experiencing the past sufficiency of using net 
income in this manner. 

7 While disclosures other than forecasts may also precede any given earnings announcement, or may occur between a
forecast and the related earnings announcement, few, if any, of these other disclosures are likely to cue investors to think 
of the summary net income figure as one of several similarly important inputs as opposed to the most important input 
(or at least, not to the same extent as forecasts). In addition, forecasts other than management forecasts, such as analyst 
forecasts, may also cue investors to conceptualize the summary net income figure differently and could either exacerbate 
or attenuate our predicted effects depending on the level of aggregation of the analyst forecast, the timing of the analyst 
forecast disclosure relative to the management disclosure, and investors' desire and ability to access the analyst forecast. 
We focus on management as opposed to analyst forecasts because we are interested in examining mechanisms that 
managers can use to reduce investor earnings fixation (other than simply eliminating forecasts altogether). 
relevant input) in assessing the company's prospects. Thus, we derive our primary hypothesis as
follows:


Hypothesis: After receiving a company's earnings announcement, the level of earnings fixa- 
tion reflected in investors' investment judgments will be lower (higher) when 
they initially observe a disaggregated (aggregated) earnings forecast. 


III. EXPERIMENTAL METHOD 
Participants 


Two hundred and one students enrolled in graduate business classes from two large state 
universities participated in the experiment as reasonably informed investors. Our participants have 
a reasonable understanding of business, accounting, and finance as, on average, participants had 
taken 7.61 accounting courses and 3.53 finance courses, and 91 percent of participants had used 
financial statements to evaluate a company's performance at least one time. In addition, 39 percent 
of participants stated that they had purchased common stock or debt securities, while 88 percent 
said that they planned to do so in the next five years. 

We use students enrolled in graduate business classes as proxies for reasonably informed 
investors for two primary reasons. First, as Libby et al. (2002, 802) note, sophisticated participants 
are in short supply and should only be used when the research question necessitates it. We 
matched participants with the goals of our experiment. Prior to conducting our experiment, we 
determined that (1) our task was similar to those described in Elliott et al. (2007) as low in 
integrative complexity, and (2) a reasonable understanding of financial accounting concepts and 
basic finance would be necessary and sufficient for participants to respond meaningfully to our
experimental materials. Second, prior research that uses tasks with similar characteristics to the 
tasks in this study (and similar to those classified as low in integrative complexity by Elliott et al. 
[2007]), finds no substantive differences in the responses of professional analysts and graduate 
business students (e.g., Barton and Mercer 2005).⁸


Experimental Design and Procedures 


To investigate our hypothesis, we use a 2 X 2 between-subjects design. Participants were 
asked to take on the role of an investor evaluating a hypothetical firm (LearningWare). Participants 
first received background information that described the company as a developer and seller of 
various software applications. After examining this background information, participants observed 
the company's forecast for the year ending December 31, 2007. The forecast was issued on March
30, 2007 and included summary income statements for the years ending December 31, 2005 and 
2006 along with the forecast, and suggested a trend in performance consistent with the two 
previous years. We manipulated the format of the earnings forecast on two levels. Half of the 
participants received an aggregated earnings forecast (Aggregated condition) that aggregated the 
components of earnings such that only a bottom-line net income measure was provided in the 
forecast. The remaining participants received a disaggregated forecast (Disaggregated condition), 
in which a forecast was provided for earnings as well as its components (see Appendix A).⁹ After


8 In addition, individual investors are an important investor group in their own right. In early 2008, approximately 52.2
million households (45 percent) owned shares of publicly traded stocks, and 16.2 million of these households held stock
outside of employer-sponsored plans (Investment Company Institute and the Securities Industry and Financial Markets
Association [ICI/SIFMA] 2008). Thus, financial executives and regulators should (and clearly do; see
http://sec.gov/about/whatwedo.shtml) consider this important group of shareholders, as well as professional investors
and analysts, in designing and regulating financial disclosures.

9 The actual format of the Aggregated and Disaggregated conditions was adapted from Hirst et al. (2007). Forecasted
earnings and its percentage increase were described in text at the top of the forecast press release in the Aggregated 
examining management's forecast, participants used an 11-point Likert-type scale to indicate the
likelihood that they would consider the company as a potential investment and the company's 
attractiveness as a potential investment (i.e., initial investment judgments are used as covariates in 
statistical tests). 

Upon completing these judgments, participants provided demographic information before 
continuing the experiment.¹⁰ Participants then received the company's earnings announcement in
addition to the same background information and forecast release they had previously seen. The 
earnings announcements for each condition showed that reported earnings were the same as 
forecasted earnings.¹¹ However, the summary income statement showed either higher revenue and
research and development expense (Favorable condition) or lower revenue and research and 
development expense (Unfavorable condition) than would be expected given the trend of the 
previous two years (Appendix B).¹²

After examining the company's earnings announcement, participants again indicated the like- 
lihood that they would consider the company as a potential investment and the company's attrac- 
tiveness as a potential investment (i.e., final investment judgments and primary dependent mea- 
sure). In addition, participants answered several questions that served as manipulation checks and 
several questions designed to detect evidence of the process they used in forming their judgments. 


Highlights of Design Features 


To test a positive (versus negative or null) hypothesis for reducing earnings fixation, we 
examine when participants discern between two companies whose earnings announcements report 
identical net income, but whose net income components indicate either more or less favorable 
future prospects. Manipulating revenue to be less (more) than expected is a clear indicator of 
unfavorable (favorable) future prospects. However, in order to provide a net income figure that 
does not change, the Unfavorable (Favorable) setting requires an income-increasing (-decreasing) 
event that also can be interpreted as unfavorable (favorable). Therefore, we also choose to ma- 
nipulate R&D expense; although decreasing (increasing) R&D expense increases (decreases) net 
income, it can be interpreted as an unfavorable (favorable) indicator. 

To reduce the likelihood that differences in information across forecast aggregation levels 


condition. Forecasts of earnings components (i.e., revenue, gross profit, research and development expense, and earnings)
and their percentage changes were described in text at the top of the forecast press release in the Disaggregated
condition. In addition, as was the case in Hirst et al. (2007), the Disaggregated condition included a forecasted summary
income statement shown alongside the summary income statements for the prior two years.

10 Asking participants to provide demographic information between receiving the initial forecast and receiving the
company's earnings announcement was done to, at least partially, simulate the natural time delay between when a company
issues an initial forecast and the subsequent earnings announcement. We acknowledge that, in the natural setting, the
time delay would be longer and that this could potentially weaken the results we report below. However, even in the
natural setting, investors have access to previously issued forecasts and are often prompted (by management or some
other financial intermediary) to reconsider these forecasts in evaluating the realizations reported in the earnings
announcement. Prior accounting studies also have allowed participants to access previously issued forecasts (see Libby et
al. 2006; Han and Tan 2007).

11 An alternative design choice would have been to manipulate reported earnings and observe whether participants'
judgments reflected the difference in reported earnings. In this case, statistically significant judgment differences would
indicate earnings fixation and a null result would indicate reduced earnings fixation. Our choice to have identical
earnings for all participants but manipulate the underlying components means that evidence for our hypothesis (i.e.,
reduced earnings fixation) is associated with a significant statistical finding rather than a null finding. Thus, we believe
our design choice provides stronger support for our hypothesis.

12 In order to examine boundary conditions for the impact of disaggregating the earnings forecast, we also manipulated the
format of the earnings announcement on two levels. Half of the participants received an earnings announcement that 
emphasized net income by highlighting earnings and its percentage change in the statement heading, while the remain- 
ing participants received an earnings announcement that deemphasized net income by highlighting the components of 
earnings along with earnings itself. We describe this boundary condition test in our "Supplemental Analyses" section. 
explain our expected findings, our setting is designed to allow participants across Forecast
Aggregation conditions to develop the same expectations for the 2007 announced income statement.
All participants observe previously announced results in a 2007 forecast that indicate net income
and its components for 2006 were roughly 20 percent higher than in 2005. In addition, all par- 
ticipants view a forecast of 2007 net income that exceeds 2006 net income by 20 percent. 
Thus, the expected increase in net income components in 2007 would likely be 20 percent. Our 
disaggregated forecast merely makes explicit the likely expectations for the 2007 net income 
components by forecasting values for 2007 net income components that are 20 percent higher than 
2006 amounts. In our discussion of alternative explanations in the “Results” section, we provide
evidence that participants had similar expectations across Forecast Aggregation conditions. 

To reduce the likelihood that the impact of disaggregating the 2007 forecast is explained by 
management signaling which items are most relevant, our disaggregated forecast provides a fore- 
cast for all components of net income. Because management in our setting displays no selectivity 
over which net income components they forecast, the likelihood that participants perceive the 
disaggregation as a signal of items that are most important is reduced. In addition to the above 
design choices, we also collect additional data to address the likelihood of these alternative 
explanations for our findings in our supplemental analyses discussed at the end of the next section. 


IV. RESULTS 

Manipulation Checks 

To determine whether participants perceived different levels of aggregation across the Aggre- 
gated and Disaggregated conditions, we asked participants to assess the level of detail in the 
forecast they observed. Using an 11-point Likert-type scale anchored on 1 = Not at All Detailed
and 11 = Very Detailed, participants gave mean judgments of forecast detail of 3.97 and 6.07 for
the Aggregated and Disaggregated forecast conditions, respectively. As expected, participants 
judged the aggregated forecast to have less detail than the disaggregated forecast (t = 6.78, p < 
0.01). In addition, we asked participants to indicate the type of forecast, if any, they observed. 
Seventy-two percent of participants in the Aggregated condition correctly identified that they had 
received a forecast of net income alone, while 89 percent of participants in the Disaggregated
condition correctly identified that they had received a forecast of each of the items on the income
statement, including net income.¹³ Together, these results suggest that participants were sensitive
to our manipulation of Forecast Aggregation level. 


Initial Investment Judgments 

Recall that participants provided initial investment judgments after viewing the management 
forecast (but before viewing the earnings announcement). Even though participants are exposed to 
one of our manipulated variables before providing their initial investment judgments, we did not 
expect the level of aggregation in the management forecast to influence participants’ initial invest- 
ment judgments in a predictable manner. Consistent with our expectation, results (not tabulated) 
reveal that participants’ initial investment judgments are not influenced by the level of aggregation 
in the management forecast to which they were exposed (p-value > 0.50). Thus, in the analyses 
reported below we simply include participants' initial investment judgments as a covariate. 


The Effects of Forecast Disaggregation 


Our hypothesis predicts that investors who initially observe a disaggregated management 
forecast will exhibit less earnings fixation than investors who initially observe an aggregated 


13 Our results are inferentially identical if we remove those participants who incorrectly responded to this manipulation
check question. 
management forecast. We test our hypothesis using participants' judgments of (1) the likelihood of
considering the company as a potential investment, and (2) the company's attractiveness as a
potential investment as indicated after participants have observed both the forecast and the
earnings announcement. A Cronbach coefficient alpha score of 0.92 reveals that the "likelihood of
investment" and "investment attractiveness" measures likely measure the same underlying con- 
struct. Therefore, we report the results of the hypothesis tests for a single combined investment 
judgment dependent measure based on the average of the two measures for each participant, 
adjusted for participants' initial investment judgments (after exposure to only the forecast) as a 
covariate.¹⁴
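
The reliability check and the combined measure described above can be illustrated with a minimal sketch. The responses below are hypothetical and are not the study's data; the function name `cronbach_alpha` and the variable names are assumptions introduced only for illustration. The sketch uses the standard formula for coefficient alpha, k/(k - 1) × (1 - sum of item variances / variance of the summed score).

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a list of items (each a list of responses).

    All items must contain one response per participant, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Summed score per participant across the k items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 11-point responses for six participants (not the study's data).
likelihood = [7, 5, 9, 6, 8, 4]
attractiveness = [8, 5, 9, 7, 8, 3]

alpha = cronbach_alpha([likelihood, attractiveness])

# Combined investment judgment: the average of the two measures per participant.
combined = [(a + b) / 2 for a, b in zip(likelihood, attractiveness)]
```

A high alpha (the paper reports 0.92) is what justifies averaging the two responses into one dependent measure; the covariate adjustment for initial judgments is a separate step that this sketch does not perform.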

Recall that participants in all conditions observe the same forecasted earnings and the same 
announced earnings that meet the company’s forecast. However, the components that underlie 
earnings differ. Earnings fixation is thus inferred when participants indicate the same investment 
judgments when the earnings components differ (i.e., across the Unfavorable and Favorable
conditions). Conversely, we infer a reduction in earnings fixation when participants’ investment 
judgments differ as the earnings components differ. 

Results of our experiment support the hypothesis. Table 1, Panel A provides adjusted least 
square means and standard errors for participants’ final investment judgments adjusted for their 
initial investment judgments as a covariate. Using an 11-point Likert-type scale, anchored on 1 = 
Not at All Likely/Attractive and 11 = Very Likely/Attractive, participants’ final investment
judgments overall are more favorable in the Favorable condition (mean = 7.30) than in the
Unfavorable condition (mean = 6.69). An overall ANCOVA (see Panel B of Table 1 and Figure 1), using
participants' initial investment judgments as a covariate, reveals a significant interaction between
the Unfavorable/Favorable and the Aggregated/Disaggregated conditions (F = 3.93, p = 0.05, 
two-tailed). 

Table 1, Panel A indicates the adjusted least square mean investment judgments for partici- 
pants in the Aggregated forecast condition are 7.58 and 7.38 for the Favorable and Unfavorable 
conditions, respectively. A simple effects test (see Panel C of Table 1) indicates that the investment 
judgments of participants in the Aggregated forecast condition are no different in the Favorable 
condition (where the components of earnings were more favorable) than in the Unfavorable 
condition (where the components of earnings were less favorable; t = 0.70, p = 0.24, one-tailed).
Turning to participants in the Disaggregated forecast condition, the adjusted least square mean 
investment judgments are 7.02 and 6.00 for the Favorable and Unfavorable conditions, respec- 
tively. In this case, a simple effects test (Table 1, Panel C) indicates a significant difference across 
the Favorable and Unfavorable conditions (t = 3.53, p < 0.01, one-tailed). As predicted,
participants in the Disaggregated forecast condition distinguish between the company's invest- 
ment potential when its earnings components were more favorable versus less favorable, suggest- 
ing that for these participants earnings fixation is reduced. 
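
The pattern in the reported cell means can be made concrete with a short arithmetic sketch. It uses only the four adjusted least square means quoted above from Table 1, Panel A; the dictionary layout and the helper name `favorable_gap` are assumptions for illustration. The sketch does not reproduce the significance tests, which depend on the standard errors in Table 1.

```python
# Adjusted least square means as reported in the text (Table 1, Panel A).
adjusted_means = {
    ("Aggregated", "Favorable"): 7.58,
    ("Aggregated", "Unfavorable"): 7.38,
    ("Disaggregated", "Favorable"): 7.02,
    ("Disaggregated", "Unfavorable"): 6.00,
}


def favorable_gap(forecast_condition):
    """Favorable minus Unfavorable mean judgment within one forecast condition."""
    return (adjusted_means[(forecast_condition, "Favorable")]
            - adjusted_means[(forecast_condition, "Unfavorable")])


gap_aggregated = favorable_gap("Aggregated")        # small gap: consistent with fixation
gap_disaggregated = favorable_gap("Disaggregated")  # larger gap: reduced fixation

# Difference between the two gaps: the contrast behind the reported interaction.
interaction_contrast = gap_disaggregated - gap_aggregated

print(round(gap_aggregated, 2), round(gap_disaggregated, 2),
      round(interaction_contrast, 2))
```

Because reported earnings are identical in every cell, a gap near zero means judgments ignored the components, while a larger gap means the components were used.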


14 When analyzing participants’ likelihood and attractiveness measures separately, each measure yields the same inferences 
as the combined investment judgment measure. 
15 A priori, the precise pattern of expected cell means is unclear. Our hypothesis specifies that the slope between
vorable and Favorable will differ by the level of aggregation in the management forecast, but it does not suggest a 
specific pattern of cell means. We expected participants in the Aggregated/Unfavorable and Aggregated/Favorable 
conditions to rate investment potential as more favorable than participants in the Disaggregated/Unfavorable condition. 
Specifically, participants in the two former conditions are expected to be more influenced by the realization of the 
predicted increase in net income, while those in the latter conditions are expected to be more influenced by the failed 
realizations of predicted increases in revenues and R&D. It is less clear how the investment judgments of participants 
in the Disaggregated/Favorable condition would fall relative to the other conditions. While these participants observed 
that the company surpassed expectations in reported revenues and spending on R&D, it is somewhat ambiguous as to 
whether the increased spending on R&D is positive or negative news. 
FIGURE 1

Effects of Forecast Disaggregation on Investment Judgments 
Participants judged their likelihood of considering the company as a potential investment (the attractiveness of
the company as an investment) using an 11-point scale anchored on 1 = Not at All Likely (Attractive) and 11 = 
Very Likely (Attractive). Participants made initial judgments after receiving an Aggregated (Disaggregated) 
management forecast of earnings only (earnings and earnings components) and made the same judgments again 
after receiving an earnings announcement revealing the realization. Participants observed announced earnings 
that matched forecasted earnings, but whose revenues and R&D expense depicted either Unfavorable or Favor- 
able performance. The adjusted least square means reported above were derived by averaging the likelihood and 
attractiveness responses provided after observing both the forecast and the realization for each participant, 
adjusted for participants’ initial investment judgments (after observing only the forecast) as a covariate. 


Evidence for Our Theory 


Our theory suggests that a disaggregated forecast’s ability to reduce earnings fixation is 
associated with investors who observe a disaggregated forecast interpreting net income to be one 
of several similarly important inputs in evaluating a company’s prospects more than investors who 
observe an aggregated forecast. To examine this theory, we asked participants to allocate 100 
points among the different components of the summary income statement, based on how important 
they deemed each item for evaluating the company as a potential investment. Fewer (more) points 
allocated to net income (net income components) would be consistent with participants viewing 
net income as one of several evaluation inputs rather than as a sufficient summary measure of the 
company’s prospects. 

The mean points allocated to net income are 26.30 and 22.82 for participants in the Aggre- 
gated and Disaggregated forecast conditions, respectively (see Table 2, Panel A). Results of an 
ANOVA test (Table 2, Panel B) reveal a main effect for forecast aggregation (F = 3.44, p = 0.03, 
one-tailed), indicating that participants in the Disaggregated condition allocate fewer points to net 
income than participants in the Aggregated condition. The result supports our theory about why 
disaggregated forecasts reduce earnings fixation.¹⁶


16 Although an examination of the means reported in Panel A of Table 2 reveals that the points allocated to net income are 
directionally higher under the Disaggregated/Favorable condition than the Disaggregated/Unfavorable condition, as 
revealed in Panel B of Table 2, there is not a significant Favorable/Unfavorable by Forecast Aggregation interaction (p
= 0.30).
To supplement this evidence, we examine the spread of points allocated across net income and
its components. If participants in the Disaggregated (Aggregated) forecast condition tend to con- 
sider net income more (less) as one of several inputs to evaluating the company, we expect the 
points they assign to the summary net income measure and to the most important net income 
component to be more (less) similar. Therefore, in an additional analysis, we calculate a ratio for
each participant that divides the points assigned to the summary net income measure by the points 
assigned to the most important net income component (the component with the greatest number of 
assigned points). Our theory is supported if the calculated ratio is significantly greater in the 
Aggregated forecast condition than in the Disaggregated condition. The means of the calculated 
ratio are 1.25 and 1.03 (Table 2, Panel C) for the Aggregated and Disaggregated conditions, 
respectively, and are marginally different (F = 2.34, p = 0.06, one-tailed, Table 2, Panel D). These
findings suggest that participants in the Aggregated condition assign 25 percent more points to the 
summary net income measure than the points they assign to the most important net income 
component, whereas participants in the Disaggregated condition assign a nearly equal number of 
points to net income and the most important net income component. Thus, participants in the 
Disaggregated condition appear to be more likely than participants in the Aggregated condition to 
consider net income as one of several similarly important inputs, further supporting our theory.¹⁷
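
The per-participant ratio described above can be sketched as follows. The allocations are hypothetical, and the item keys (`revenue`, `gross_profit`, `rd_expense`, `net_income`) and the function name `net_income_ratio` are assumptions for illustration; the actual summary income statement may list other items.

```python
def net_income_ratio(points):
    """Points on net income divided by points on the top-rated component.

    `points` maps each income statement item to its allocated points,
    which must sum to 100 as in the allocation task.
    """
    if sum(points.values()) != 100:
        raise ValueError("allocations must sum to 100 points")
    component_points = [v for item, v in points.items() if item != "net_income"]
    top_component = max(component_points)
    if top_component == 0:
        raise ValueError("ratio undefined when no component receives points")
    return points["net_income"] / top_component


# Hypothetical allocations (not the study's data).
balanced = {"revenue": 30, "gross_profit": 20, "rd_expense": 20, "net_income": 30}
fixated = {"revenue": 20, "gross_profit": 20, "rd_expense": 20, "net_income": 40}

ratios = [net_income_ratio(balanced), net_income_ratio(fixated)]
mean_ratio = sum(ratios) / len(ratios)
```

A ratio near 1 indicates that net income is weighted about the same as the most important component; a ratio well above 1 indicates that net income dominates. The reported condition means of 1.25 and 1.03 are averages of such per-participant ratios.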


Benchmark Framework as an Alternative Theoretical Explanation 


Prior research in accounting has considered a benchmark framework when examining the 
impact of management forecasts on investor judgments (e.g., Krische 2005; Libby et al. 2006; Han 
and Tan 2007). An assumption underlying the benchmark framework as applied in investment 
settings is that investors' judgments might be sensitive to the number of benchmarks missed, met, 
or exceeded such that when a greater number of benchmarks are met (missed), investors perceive 
the company as a more (less) favorable potential investment (Han and Tan 2007). Because fore- 
casts represent benchmarks of company performance, a benchmark framework represents an at- 
tractive approach to deriving predictions in our setting as an alternative to our theoretical frame- 
work. 

In applying a benchmark framework to our experimental setting, one could plausibly reason 
that disaggregated forecasts highlight different benchmarks than aggregated forecasts. For ex- 
ample, participants in the Disaggregated forecast condition could use the forecasted line item 
amounts as benchmarks, while those in the Aggregated forecast condition could use prior year line 
item amounts as benchmarks because they observe no explicit forecasts to serve as benchmarks. 
More specifically, comparing the net number of benchmarks exceeded when prior year, line item 
amounts are used as benchmarks to the net number of benchmarks exceeded when the forecasts of 
line items are used as benchmarks can yield a pattern of investment judgments similar to our 
findings. 

Our experimental design does not allow us to discern clearly whether participants in our
setting adopted a processing approach more consistent with our theoretical framework or a bench- 
mark framework. Improving our understanding of the processing approach investors adopt when 
presented with forecasted information (and subsequent realizations) is important given the increas- 
ing frequency and importance of forecasts. Thus, parsing out these two theoretical perspectives 
represents a promising avenue for future research. 


17 In addition, the proportion of participants who assigned their greatest number of points to Net Income was greater in the 
Aggregated condition (42 percent) than in the Disaggregated condition (33 percent). A one-sided Fisher Exact test (not 
tabulated) indicates that the difference in proportion is marginally significant (p = 0.09). 


The Accounting Review January 2011 
American Accounting Association





Disaggregating Management Forecasts 201 


V. SUPPLEMENTAL ANALYSES 
In this section, we examine whether the earnings-fixation-reducing effect of a disaggregated 
forecast is bounded by characteristics of the earnings announcement. We also provide evidence 
that our results are not fully explained by three plausible alternative explanations, namely, (1) 
perceived differences in information availability, (2) a pure management signal of the importance 
of certain financial statement items, or (3) perceived differences in forecast credibility across the 
level of forecast aggregation. 


The Boundary Effects of Emphasis in Earnings Announcements


Although our study focuses on the ability of forecast disaggregation to reduce earnings fixa- 
tion in investors' judgments, we include an examination of earnings announcement characteristics 
to examine boundary conditions for our hypothesis. Specifically, we consider whether an earnings 
announcement that emphasizes a bottom-line net income number can hinder the ability of a 
disaggregated forecast to reduce investors' earnings fixation. Investment professionals suggest that 
the extent to which a report emphasizes net income can affect investors' tendency to fixate on net 
income. For example, financial analysts assert that the power of a summary earnings number in 
proposing to eliminate any summary earnings from the income statement, in part, reduces "the 
focus on a single, arbitrary performance indicator, accounting net income" (CFA Centre for Fi- 
nancial Market Integrity 2007, 30). 

We include an Earnings Emphasis manipulation in our experiment in which the earnings 
announcement either emphasizes net income by highlighting earnings and its percentage change in 
the press release heading, or deemphasizes net income by including in the press release heading
the components of net income together with net income itself, and the respective percentage
changes from the prior period.¹⁸ We then expand the ANCOVA model (untabulated) used to test
our hypothesis to include all main and interactive effects of Earnings Emphasis on investors’
judgments. Results indicate that whether the earnings announcement emphasizes or deemphasizes
earnings has no impact on the ability of disaggregated forecasts to reduce earnings fixation. 
Specifically, an expanded ANCOVA (untabulated) reveals no three-way interaction between Forecast
Aggregation, Earnings Emphasis, and Unfavorable/Favorable (F = 0.26, p = 0.61, two-tailed).
Indeed, in observing the adjusted least square mean investor judgments, the disaggregated
forecast appears to reduce earnings fixation when the earnings announcement emphasizes earnings 
(Unfavorable = 6.24 versus Favorable = 7.25) by nearly an identical amount as when the 
announcement deemphasizes earnings (Unfavorable = 5.76 versus Favorable = 6.76). Moreover, 
we find no significant interaction between the Unfavorable/Favorable and the Earnings Emphasis 
conditions (F = 0.22, p = 0.64, two-tailed) or the Forecast Aggregation and Earnings Emphasis 
conditions (F = 1.07, p = 0.30, two-tailed). Thus, earnings emphasis in the announcement does 
not appear to affect the reduction in fixation provided by disaggregating an earnings forecast.¹⁹


Differences in Available Information 


We now consider the possibility that the effect of forecast disaggregation on reducing earn- 
ings fixation merely reflects a difference in the forecast information available to participants. As is 


18 Using a Likert-type scale anchored on 1 = Not at all detailed and 11 = Very detailed, participants judged the detail of 
the emphasized announcement (5.44) and the deemphasized announcement (6.10) to be significantly different (t = 2.21, 
p = 0.03). 

19 We had 24 participants take part in an experiment that was equal to our main experiment with the significant exception 
that no forecasts were given. Thus, this was a direct test of earnings emphasis alone. All 24 participants were placed in 
the Unfavorable condition and were randomly assigned to earnings announcements that either emphasized or deempha- 
sized earnings. We found no significant effect for Earnings Emphasis in participant judgments (F = 1.13, p = 0.30, with 
initial judgments as the covariate) or points allocated to net income (t = 0.86, p = 0.40). Thus, in our setting, it does 
not appear that Earnings Emphasis alone eliminates (or even significantly reduces) earnings fixation. 




202 Elliott, Hobson, and Jackson


inherently the case in natural settings (and following Hirst et al. [2007]), disaggregated forecasts 
in our experiment specify a greater number of items than do aggregated forecasts. Thus, one could 
argue that participants who observed aggregated forecasts ignored the net income components 
because the lack of forecast information made it impossible to assess the components as positive 
or negative news. This is likely not the case, however. First, if participants in the Aggregated 
condition were unable to evaluate the net income components, then we would expect some of 
them to assign approximately zero points to the net income components. The results, in contrast, 
show that all participants assigned some points to the net income components. Second, we chose
the values used in the disaggregated forecast to represent an extension of the linear trend of the 
2005 and 2006 income statement amounts. This should allow participants in the Aggregated 
condition to derive an expectation for the components of net income similar to the expectations 
derived by those in the Disaggregated condition.²⁰ Additional data below suggest that participants 
in both aggregation levels derive similar expectations. 

An investor's surprise following a company announcement represents the extent to which 
reported amounts deviate from the investor's expectations. Therefore, if participants in the Aggre- 
gated condition did not form expectations of revenue, research and development, and net income 
(because they received no forecasts for earnings components), then we would expect them to be 
less surprised at the amounts reported for these items in the earnings announcement than partici- 
pants in the Disaggregated condition (who received forecasts of the components). To examine this 
proposition, we ask participants the following question with regard to revenue, research and 
development, and net income: “How surprised were you by the amount of Revenues/R&D/Net 
Income reported for 2007?” Participants answered on a Likert-type scale anchored on 1 = “Not at 
All Surprised” and 11 = “Very Surprised.” The mean surprise for announced net income is 5.52 
and 5.11 for those in the Aggregated and Disaggregated forecast conditions, respectively. The 
mean surprise for announced revenue is 5.47 and 5.82 for those in the Aggregated and Disaggre- 
gated forecast conditions, respectively. And the mean surprise for announced R&D is 6.24 and 
6.52 for those in the Aggregated and Disaggregated forecast conditions, respectively. An ANOVA 
(not tabulated) reveals no significant main effect for Forecast Aggregation (all p-values > 0.30), 
nor any significant interaction effects (all p-values > 0.49) for any of these dependent measures.²¹ 
Thus, our findings do not seem to be the result of participants who received an aggregated forecast 
not developing expectations for the components of net income.²² 


Pure Management Signal of Importance of Certain Financial Statement Items 


We next consider whether our observed reduction in fixation is driven purely by participants’ 
perception that the level of forecast aggregation is a signal from management regarding the 
importance of certain financial statement items. In natural settings, companies can be selective 
about what items to include in the forecast (see Exhibit 1). In such cases, it is reasonable for 


20 Our design choice of a linear trend was made in an attempt to maximize internal validity. Of course, in the natural 
setting, forecasts of income statement line items do not always match the prior linear trends of those line items. 
However, we see no reason why our inferences would not extend to such a setting. 

21 Importantly, the level of surprise expressed by participants in the Unfavorable condition significantly differed from that 
expressed by participants in the Favorable condition for revenues (p = 0.02), R&D (p < 0.01), and net income (p = 
0.07). This indicates that our null result for aggregation was not due to our measure for surprise being inadequately 
sensitive to detect differences in participants’ levels of surprise. 

22 To confirm that participants across aggregation levels attended to the amounts reported in the earnings announcement, 
we asked participants to recall the amounts reported for 2008 revenue, R&D Expense, and net income. We also asked 
them to recall how much 2007 amounts for these line items changed in 2008. There were no differences across Forecast 
Aggregation conditions for any of these recall measures (all p-values > 0.10). 
investors to conclude that managers might use a disaggregated forecast to highlight those items 
that are particularly relevant and exclude items that are less relevant. However, our experimental 
design makes such signaling less likely; recall that our disaggregated forecast includes all items 
reported in the subsequent earnings announcement. The apparent lack of management selectivity 
should reduce the likelihood that participants perceived the disaggregation as a signal of the 
particular items that were most important. 

Evidence from our data also runs counter to a signaling explanation for our findings. Recall 
that, in addition to manipulating the aggregation level of forecasts, we also manipulated whether 
a subsequent earnings announcement emphasized net income. Emphasizing net income or net 
income components in an announcement would suggest a similar management signal to any signal 
ascribed to the aggregation level of a forecast. However, as noted previously, we find no evidence 
that emphasizing earnings in an earnings announcement reduces earnings fixation. Thus, a strict 
signaling explanation for our overall findings is unlikely. 


Forecast Credibility 


Hirst et al. (2007) find that forecast aggregation affects investors’ judgments of forecast 
credibility. Therefore, we also consider whether our results can be explained by differences in 
forecast aggregation causing differences in perceived forecast credibility.²³ Similar to Hirst et al. 
(2007) we ask participants to judge the credibility of the forecast provided. Participants provided 
responses on a Likert-type scale with end points, 1 = Not at All Credible and 11 = Very Credible. 
It should be noted that Hirst et al. (2007) elicit participants’ judgments of forecast credibility 
directly after observing a management forecast (and the participants never observe a subsequent 
earnings announcement). In contrast, we do not elicit judgments of forecast credibility until after 
participants observe both the management forecast and the subsequent earnings announcement (so 
as not to explicitly cue participants to think about forecast credibility before viewing the subse- 
quent earnings announcement and making their final investment judgments). 

Our participants’ mean ratings of forecast credibility are higher when they observe a disag- 
gregated forecast (mean = 7.15) as compared to an aggregated forecast (mean = 7.00), direction- 
ally consistent with the results of Hirst et al. (2007). However, an ANOVA (not tabulated) reveals 
no main or interactive effects when forecast credibility is the dependent measure and our three 
independent variables and all of their interactions are included. Thus, it does not appear that 
differences in participants’ judgments of forecast credibility are driving our results. In addition, it 
is not surprising that our results for the effect of forecast aggregation on participants’ judgments of 
forecast credibility are not as strong as those reported in Hirst et al. (2007), as our participants 
viewed a realization of the forecast before evaluating forecast credibility. 


23 In addition to the credibility question described below, we asked related questions about management's trustworthiness, 
competence, and credibility; questions about management's opportunities, motives, and incentives to manage earnings; 
questions about how constrained management was to manage earnings; and a question about the quality of reported net 
income. No significant differences across aggregation levels were noted for any of the measures (all p-values > 0.15, 
two-tailed). However, participants' responses did differ across the Unfavorable and Favorable conditions, as expected. 
Specifically, participants in the Unfavorable condition who observed a cut in R&D (as opposed to an increase in R&D) 
rated management as less trustworthy (p = 0.08, one-tailed), less forthcoming (p = 0.01, one-tailed) and less competent 
(p 0.01, one-tailed) than those in the Favorable condition. In addition, participants in the Unfavorable condition rated 
the quality of net income lower than those in the Favorable condition (p = 0.06, one-tailed). These latter results suggest 
that participants did attend to management characteristics and the related quality of net income associated with cutting 
R&D (as opposed to increasing R&D) to arrive at a forecasted realization; however, our participants did not believe that 
these characteristics differed by the level of forecast aggregation. Finally, it is unclear, ex ante, if participants' responses 
to the questions about management's opportunities, motives, and incentives to manage earnings, and how constrained 
management was to manage earnings would differ across the Unfavorable and Favorable conditions. In fact, no 
significant differences were observed (all p-values > 0.10, two-tailed). 
VI. CONCLUSION 

This study examines a mechanism intended to reduce investors' susceptibility to earnings 
fixation (i.e., excessive reliance on accounting earnings as the primary input in investment judg- 
ments, without fully considering other information that may be relevant to evaluating the compa- 
ny's investment prospects). We find that investors are less susceptible to earnings fixation when 
they initially observe a disaggregated management forecast (which forecasts earnings and its 
components) than when they observe an aggregated forecast (which forecasts only earnings). 

Specifically, participants' investment judgments reflect less earnings fixation when they ini- 
tially observe a disaggregated management earnings forecast. This reduction in earnings fixation 
appears to be associated with investors interpreting the summary net income figure as one of 
several similarly important evaluation inputs rather than a substantially more important input 
(relative to its components). We provide additional evidence that our results cannot be fully 
explained by several plausible alternative explanations: (1) perceived differences in information 
availability across levels of aggregation, (2) perceived management signals of the importance of 
certain financial statement items, or (3) perceived differences in forecast credibility across levels 
of aggregation. Finally, we provide evidence that suggests our results are not bounded by the level 
of emphasis on net income in the subsequent earnings announcement. 

As with all experimental research, our study is not without limitations. Participants in our 
study did not have access to all the information they might have available in a natural setting. 
Moreover, a limitation specific to our study is that an aggregated (disaggregated) forecast, by 
definition, provides investors with a single (multiple) benchmark(s). Thus, the level of forecast 
aggregation, in our experiment as well as in the real world, is inexorably linked with the number 
of benchmarks in the forecast, forming a natural environmental confound. In addition, while we 
report evidence that a reduction in earnings fixation results from a disaggregated forecast activat- 
ing an investor-held knowledge structure, we cannot determine whether this activation occurs 
prior to or subsequent to investors observing the earnings announcement. Finally, because man- 
agers may strategically issue a disaggregated forecast to influence investor perceptions of firm 
value, providing disaggregated forecasts may not always lead investors to optimal judgments. 

Despite these limitations, our results provide important insights regarding a potential mecha- 
nism to reduce investors' susceptibility to earnings fixation—an institutional phenomenon that can 
be undesirable to managers and investors. In addition, our study extends the disclosure literature in 
accounting by providing evidence that a forecast (and its characteristics) can influence how in- 
vestors interpret subsequently reported accounting information. Our findings should be of interest 
to managers, who have incentives to reduce pressures for myopic behavior, investors, whose 
investment returns may be adversely affected by such myopic behavior, and regulators, who have 
expressed concern about investors’ susceptibility to earnings fixation and the resulting impact on 
manager behavior. 


APPENDIX A 
DISAGGREGATED MANAGEMENT FORECAST 
LearningWare Provides Forecast for 2007 
Tallahassee, FL -- LearningWare today provided a forecast for the full year, which ends on 
December 31, 2007. LearningWare expects revenues to increase 20 percent to $43.2 million. The 
gross profit margin is projected to remain at 49 percent. Research and development expenses are 


24 The forecast shown is for the Disaggregated condition. Participants in the Aggregated condition received only the last 
sentence of the text and received only the audited 2005 and 2006 summary income statements. 
expected to increase 19 percent to $6.9 million. Selling, general, and administrative expenses are 
expected to increase 20 percent to $4.9 million. LearningWare expects 2007 net income to in- 
crease 22 percent to $6.1 million. 


Income Statement (all figures in millions) 

Fiscal Year ending December 31 

                                                      Audited          Unaudited
                                                                         2007
                                                  2005     2006       (Forecast)
Revenue                                          $30.0    $36.0         $43.2
Cost of Sales                                     15.3     18.4          22.0
Gross Profit                                      14.7     17.6          21.2
Research and Development Expense                   4.9      5.8           6.9
Selling, General, and Administrative Expense       3.4      4.1           4.9
Earnings Before Income Taxes                       6.4      7.7           9.4
Income Tax Expense                                 2.2      2.7           3.3
Net Income                                        $4.2     $5.0          $6.1


About LearningWare: LearningWare (http://www.learningware.com) develops and sells various 
software applications that help schools educate and evaluate students and help businesses assess 
the skills of their employees. This news release contains forecasted information that is not audited. 
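
The percentage changes quoted in the release can be recomputed from the accompanying income statement. The following is a minimal arithmetic sketch and not part of the experimental materials; the dictionary and function names are illustrative assumptions, and the amounts are copied from the 2006 and 2007 (Forecast) columns.

```python
# Amounts in $ millions, copied from the Appendix A income statement.
actual_2006 = {"Revenue": 36.0, "Gross Profit": 17.6, "R&D": 5.8,
               "SG&A": 4.1, "Net Income": 5.0}
forecast_2007 = {"Revenue": 43.2, "Gross Profit": 21.2, "R&D": 6.9,
                 "SG&A": 4.9, "Net Income": 6.1}


def pct_change(old, new):
    """Percentage change from old to new, rounded to a whole percent."""
    return round(100 * (new - old) / old)


# Growth rates for the items whose increases are stated in the release.
growth = {item: pct_change(actual_2006[item], forecast_2007[item])
          for item in ("Revenue", "R&D", "SG&A", "Net Income")}

# Projected gross profit margin, in whole percent.
gross_margin_2007 = round(
    100 * forecast_2007["Gross Profit"] / forecast_2007["Revenue"])

print(growth)
print(gross_margin_2007)
```

Rounded to whole percentages, the recomputed changes agree with the 20, 19, 20, and 22 percent increases and the 49 percent gross margin stated in the release.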


APPENDIX B 
UNFAVORABLE/FAVORABLE ANNOUNCED INCOME STATEMENTS²⁵ 


Panel A: Unfavorable Condition 

                                                          Audited
                                                  2005     2006     2007
Revenue                                          $30.0    $36.0    $39.6
Cost of Sales                                     15.3     18.4     20.2
Gross Profit                                      14.7     17.6     19.4
Research and Development Expense                   4.9      5.8      5.1
Selling, General, and Administrative Expense       3.4      4.1      4.9
Earnings Before Income Taxes                       6.4      7.7      9.4
Income Tax Expense                                 2.2      2.7      3.3
Net Income                                        $4.2     $5.0     $6.1




25 Panels A and B indicate the summary income statements participants observed when in the Unfavorable and Favorable 
announcement conditions, respectively. Amounts highlighted (were not highlighted in actual experiment) indicate the 
fundamental differences between the two conditions although net income is identical. Earnings fixation is reduced 
(enhanced) to the extent that participants' investment judgments discern (do not discern) between the Unfavorable and 
Favorable income statements. 
Panel B: Favorable Condition 

                                                          Audited
                                                  2005     2006     2007
Revenue                                          $30.0    $36.0    $46.8
Cost of Sales                                     15.3     18.4     23.9
Gross Profit                                      14.7     17.6     22.9
Research and Development Expense                   4.9      5.8      8.6
Selling, General, and Administrative Expense       3.4      4.1      4.9
Earnings Before Income Taxes                       6.4      7.7      9.4
Income Tax Expense                                 2.2      2.7      3.3
Net Income                                        $4.2     $5.0     $6.1
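
To make the manipulation concrete, the sketch below compares each panel's 2007 amounts with the common 2006 amounts. It is an illustration only: the names are hypothetical, and only the revenue, R&D, and net income rows of the two panels are used.

```python
# Amounts in $ millions; the 2006 column is common to both panels.
base_2006 = {"Revenue": 36.0, "R&D": 5.8, "Net Income": 5.0}
unfavorable_2007 = {"Revenue": 39.6, "R&D": 5.1, "Net Income": 6.1}
favorable_2007 = {"Revenue": 46.8, "R&D": 8.6, "Net Income": 6.1}


def pct_change(old, new):
    """Percentage change from old to new, rounded to a whole percent."""
    return round(100 * (new - old) / old)


def summarize(panel):
    """Year-over-year change of each item in a panel relative to 2006."""
    return {item: pct_change(base_2006[item], panel[item])
            for item in base_2006}


print("Unfavorable:", summarize(unfavorable_2007))
print("Favorable:  ", summarize(favorable_2007))
```

Under these figures, both panels show the same 22 percent increase in net income, while revenue grows 10 percent versus 30 percent and R&D falls by roughly 12 percent versus rising by roughly 48 percent. A reader who looks only at the bottom line therefore cannot tell the two conditions apart.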
ABSTRACT: We document how the effectiveness of an accruals-based trading strat- 
egy changes with the benchmark used to identify an extreme accrual. We measure 
“percent accruals” as accruals scaled by earnings, rather than total assets, and show 
that this seemingly small change produces a radically different sort of the data. We find 
that a trading strategy based on percent accruals yields significantly larger annual 
hedge returns than the traditional accruals measure, and does so mostly by improving 
the long position in low-accrual stocks. The hedge returns are also significant in all but 
the lowest quintile of arbitrage risk. We show that percent accruals more effectively 
select firms where the differences between sophisticated and naive forecasts are the 
most extreme. As such, our results are consistent with the earnings fixation hypothesis 
and are inconsistent with some alternative explanations for the accrual anomaly. 


Keywords: accruals; market inefficiency; abnormal returns. 


Data Availability: Data used in this study are available from the sources identified in 
the text. 


I. INTRODUCTION 

The accruals anomaly, originally identified by Sloan (1996), is one of the most important 
findings in accounting research. By purchasing firms with low accruals and selling short 
firms with high accruals, Sloan (1996) documents a significant hedge return. A large lit- 
erature has followed exploring the cause of this market inefficiency, but also questioning the size 
and persistence of the anomaly. In the vast majority of these studies, accruals are measured as 
earnings less cash from operations, scaled by average total assets. In this study we propose a small 
change in the definition of accruals that yields surprisingly different results. Instead of scaling 
accruals by average total assets, we scale by earnings. We label this definition of accruals as 
“percent accruals” and argue that it is a more natural interpretation of the original idea that 
investors are fixated on reported earnings and do not distinguish between accruals and cash flows. 

Percent accruals identify an extreme observation as one for which the accrual component 
makes up a large positive or negative fraction of the total earnings for that year.¹ This measure of 
accruals focuses on the composition of earnings, distinguishing how much is cash and how much 
is accrual. We show that this seemingly simple change in the definition of an extreme accrual 
produces a radically different sort of the data; only 12 percent of the data in the first decile of the 
traditional accruals measure are in the first decile of our new measure. By simply changing the 
benchmark for an extreme accrual, we find a hedge return that is much larger than the return based 
on the traditional definition of accruals, and is particularly successful for the long position. 

The percent accruals measure differs most from the traditional accruals measure in the lowest 
decile—the firms for which a trading strategy takes a long position. The returns to this part of a 
hedge trading strategy are particularly important because the transaction costs and other limits to 
arbitrage are significantly lower on a long position than on a short position. We find a 5.53 percent 
abnormal return to the long position in the first decile of percent operating accruals that is sig- 
nificant at the 0.001 level. This return is four times as large as the 1.27 percent abnormal return in 
the first decile of the traditional accruals measure, which is not significantly different from zero.² 
Further, the firms in the first decile of percent operating accruals have an average market value of 
$1,510 million, as compared to only $474 million for the first decile of the traditional operating 
accruals measure. This suggests that the transaction costs to trading on a percent accruals anomaly 
are probably much lower than for the traditional accruals measure, and make the evidence for 
inefficient pricing of percent accruals even more compelling. In addition, we show that the hedge 
returns based on percent accruals are significant in all but the lowest quintile of arbitrage risk, 
making the results even harder to reconcile with an efficient market. 

Because percent accruals—accruals measured as a percent of absolute income—is a new 
definition of accruals, we examine what types of firms end up in the extreme high and low 
portfolios of this measure, and why they might experience excess positive or negative returns in 
the future. The typical firm in the extreme negative percent operating accrual portfolio has large 
positive cash from operations, but then accrues income back down to something much closer to 
zero; similarly, to end up in the extreme positive percent operating accrual portfolio, a typical firm 
has large negative cash from operations, but accrues income up to something near zero. We show 
that these extreme combinations of cash from operations and accruals are exactly the combinations 
that produce the most extreme differences between a sophisticated income forecast—one that 
distinguishes between cash flows and accruals—and a naive forecast. As a result, the percent 
accruals measure is arguably superior to the traditional accruals measure at identifying cases for 
which the earnings fixation hypothesis predicts the largest mispricing. 

The percent accruals measure has other advantages. We show that the extreme percent accru- 
als are not disproportionately sensitive to special items, as Dechow and Ge (2006) report for 
traditional operating accruals, and the hedge returns are equally large in the subsample with or 
without special items. We also show that the long position is more effective in the subsample of 


1 In order to assure that the sign of the accrual is maintained when we sort firms into portfolios, we scale by the absolute 
value of net income. Because net income can be at or near zero, the result will sometimes be extreme. Note, however, 
that we only use this variable to sort firms into ten portfolios; beyond placing a firm into the first or tenth portfolio, the 
extremity of the variable is irrelevant. 

2 As discussed in the next section, comparing the hedge returns from different accruals studies is difficult due to a 
prevalent look-ahead bias documented in Kraft et al. (2006). 
firms with losses than those with gains, unlike the traditional accrual measure that has been shown 
to have no predictive power for future returns when the sample is limited to loss firms (Dopuch et 
al. 2009). 

Aside from offering new evidence consistent with the accruals anomaly, our study contributes 
to the growing literature that attempts to explain the cause of the anomaly. The main question in 
this literature is whether the anomaly is driven by investors' failure to understand the different 
mean-reversion tendencies of cash flows versus accruals—commonly known as the earnings fixa- 
tion hypothesis. One inconsistency between prior accrual anomaly results and this hypothesis is 
that existing studies find very little evidence of mispricing in the low-accrual portfolio.³ An 
alternative to the earnings fixation hypothesis is offered by Kothari et al. (2006). They argue that 
managers of overpriced stocks manipulate earnings up to maintain the overvaluation for as long as 
possible, the market is fooled by these positive accruals, and this causes the high-accrual portfolio 
to earn a significant negative return when the accruals reverse and the market corrects itself. In 
contrast, managers of undervalued firms have no incentive to manipulate earnings down, and 
consequently there is no predicted undervaluation for the low-accrual portfolio, consistent with the 
near-zero returns observed in this portfolio. Our results reinstate the significant underpricing of the 
low-accrual portfolio when accruals are measured relative to earnings instead of total assets. This 
is inconsistent with the asymmetric prediction of Kothari et al.’s (2006) agency hypothesis. More 
generally, since neither the earnings fixation hypothesis nor the agency hypothesis is conditioned 
on exactly how accruals are scaled, our results suggest that a more refined theory is in order. 

Another competitor to the earnings fixation hypothesis is offered by Fairfield et al. (2003), 
who note that when accruals are scaled by assets, the resulting measure is essentially the percent- 
age change in operating assets. If investors fail to understand that investments tend to have 
diminishing returns, then they may overprice large increases in assets and underprice large de- 
creases in assets. Rather than fixating on earnings, investors simply fail to understand the eco- 
nomics of investment. Our results do not invalidate this interpretation of the past accruals litera- 
ture or the evidence supporting the “growth” hypothesis, but this interpretation does not apply to 
our rescaled measure of accruals. Finally, as discussed above, the percent accruals measure pro- 
duces a better implementation of the trading strategy predicted by the earnings fixation hypothesis 
than the traditional accruals measure, and it generates superior excess returns, lending further 
evidence in support of the earnings fixation hypothesis. 

Two other studies are related to ours. Dopuch et al. (2009) show that the traditional accrual 
anomaly holds only for profit firms; the hedge return in the loss firm sample is not statistically 
different from zero. This finding is due primarily to loss firms with large negative accruals; this 
portfolio is supposed to have positive excess returns in the accrual anomaly, yet the returns are 
—0.2 percent in the loss sample. Firm-years with losses comprise roughly one third of the sample, 
so this is not an inconsequential limitation. In contrast, the return to a hedge strategy based on 
percent accruals is slightly larger in the loss subsample than in the gain subsample, and is signifi- 
cant in both samples. 

Cheng and Thomas (2005) assess 22 different abnormal accruals models and 5 different 
design choices as a comprehensive investigation into the specifications used to study the accruals 
anomaly. They find significant variation in the size of the anomaly over the 704 permutations 
considered. However, all of their accruals models are scaled by some measure of firm size, either 
total assets or market value, but never by earnings. 


3 See Kraft et al. (2006) and Zach (2004) specifically for this topic, as well as Beaver et al. (2007), Beneish and Vargus 
(2002), D' Avolio et al. (2002), Houge and Loughran (2000), and Lesmond and Wang (2006). 
In the next section we describe our new definition of extreme accruals, in Section III we 
discuss the sample properties of the different accrual measures, and in Section IV we present stock 
return results based on our new measures. We investigate the cause of the percent accruals’ success 
in Section V and conclude in Section VI. 


II. IDENTIFYING EXTREME ACCRUALS 
To begin we define the two "traditional" accrual measures found in the literature. The most 
common definition is: 


Traditional Operating Accruals = (Net Income 
— Cash from Operations)/Average Total Assets. 


Sloan’s (1996) original definition was based on balance sheet changes but, starting with Collins 
and Hribar (2002), the data from the statement of cash flows have become the preferred definition 
because they exclude the effects of acquisitions and foreign currency translation adjustments. We 
use the same variables as in Kraft et al. (2006), calculating traditional operating accruals as the 
difference between net income (Compustat item 172) and cash from operations (item 308), then 
dividing by the average of total assets (item 6). 
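
As a minimal sketch of this computation, the measure could be coded as follows. It assumes that the average is taken over beginning- and end-of-year total assets (the text says only "the average of total assets"); the function and argument names are illustrative, and the example inputs are made up rather than drawn from the sample.

```python
def traditional_operating_accruals(net_income, cash_from_operations,
                                   total_assets_begin, total_assets_end):
    """Operating accruals scaled by average total assets.

    Assumption: "average total assets" is the simple mean of the
    beginning- and end-of-year balances.
    """
    average_total_assets = (total_assets_begin + total_assets_end) / 2
    return (net_income - cash_from_operations) / average_total_assets


# Hypothetical firm: income of 50, operating cash flow of 80,
# and total assets growing from 900 to 1,100.
example = traditional_operating_accruals(50.0, 80.0, 900.0, 1100.0)
print(example)
```

In this made-up case the accrual of -30 is small relative to average assets of 1,000, so the scaled measure is about -0.03.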

Recently, Richardson et al. (2005) introduced a more comprehensive measure of accruals, 
specified as: 


Traditional Total Accruals = [Net Income 
— (Net Dividends and Distributions to/from Equityholders 
+ increase in the cash balance) ]/Average Total Assets. 


Traditional total accruals are also computed using the cash flow statement, as in Richardson et al. 
(2005), calculated here as the difference between income before extraordinary items (item 172) 
less total operating, investing, and financing cash flows (items 308, 311, and 313) plus sales of 
common stock (item 108) less stock repurchases and dividends (items 115 and 127), divided by 
the average of total assets (item 6). 
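
The two descriptions of total accruals above (the bracketed definition and the cash-flow-statement computation) can be reconciled in a short sketch. The function and argument names are illustrative assumptions, the inputs are hypothetical, and average total assets is taken as given.

```python
def traditional_total_accruals(income, cfo, cfi, cff, stock_sales,
                               repurchases, dividends,
                               average_total_assets):
    """Total accruals scaled by average total assets.

    The increase in the cash balance is the sum of operating, investing,
    and financing cash flows; net distributions to equity holders are
    repurchases plus dividends less sales of common stock.
    """
    change_in_cash = cfo + cfi + cff
    net_distributions = repurchases + dividends - stock_sales
    numerator = income - (net_distributions + change_in_cash)
    return numerator / average_total_assets


# Hypothetical inputs, chosen only to illustrate the arithmetic.
example = traditional_total_accruals(
    income=100.0, cfo=150.0, cfi=-120.0, cff=-10.0,
    stock_sales=5.0, repurchases=20.0, dividends=15.0,
    average_total_assets=1000.0)
print(example)
```

Expanding the numerator gives income less the three cash flows, plus stock sales, less repurchases and dividends, which is the order of operations described in the text.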

Note that both definitions of accruals, and all those we could find in the literature (see Cheng 
and Thomas 2005), scale by some measure of firm size, in this case total assets. So firms in the 
first decile of accruals have large negative accruals relative to their size. Another natural bench- 
mark for extremity would be relative to the firm's earnings for the period. By computing accruals 
as a percent of earnings, the measure focuses on the composition of earnings—the relative 
amounts of cash and accruals. To assure that negative accruals are in the lower portfolios and 
positive accruals are in higher portfolios, we scale by the absolute value of earnings. That is, our 
percent accruals measures are: 


Percent Operating Accruals = (Net Income − Cash from Operations)/|Net Income| 
and 
Percent Total Accruals = [Net Income 
− (Net Dividends and Distributions to/from Equityholders 
+ increase in the cash balance)]/|Net Income|. 
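
A minimal sketch of the percent operating accruals computation follows. The function name is illustrative. Returning a signed infinity when net income is zero is our assumption, chosen so that such observations sort to the extreme portfolio matching the sign of the accrual.

```python
import math


def percent_operating_accruals(net_income, cash_from_operations):
    """Operating accruals as a fraction of the absolute value of net income."""
    accrual = net_income - cash_from_operations
    if net_income == 0:
        # Assumption: an undefined ratio is mapped to +/- infinity so the
        # observation sorts to the highest or lowest portfolio by accrual sign.
        return math.copysign(math.inf, accrual)
    # Dividing by |net income| keeps the sign of the accrual itself.
    return accrual / abs(net_income)


# Georgia Pacific 2001: net income 343, cash from operations 1747.
georgia_pacific = percent_operating_accruals(343, 1747)
print(round(georgia_pacific, 2))

# A loss firm with positive operating cash flow still gets a negative value.
loss_firm = percent_operating_accruals(-0.259, 0.8)
print(loss_firm < 0)
```

Scaling by signed net income instead would flip the sign for loss firms, so a negative accrual in a loss year would sort into the high portfolios; the absolute value avoids this.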


For the percent accruals measures, firms in the first accruals decile have large negative 
accruals and small positive or small negative earnings; this combination yields a large negative 
numerator and a small denominator. The cutoff to be in the lowest decile of percent operating 
accruals varies by year, ranging from −5.7 to −3.1. This implies that all the firms in this decile 
have positive cash from operations and their negative accrual pushes their net income back toward 
zero (if cash from operations was negative, then a negative accrual would only push net income 
further from zero, making the denominator greater than the numerator and the resulting value 
greater than −1). Similarly, firms in the highest decile of percent accruals tend to have negative 
cash from operations and their positive accrual pushes their net income back toward zero. Because 
the denominator is the absolute value of net income, and net income can be near zero, some values 
of the percent accruals are extreme.⁴ Note, however, that this variable is only used to sort firms 
into portfolios; extreme values have no more impact on the results than do the more moderate 
values within the same portfolio. 
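
The bound of −1 invoked in this argument can be checked directly. The short derivation below is a sketch in our own notation: the symbols NI, CFO, A, and PA are labels introduced here for net income, cash from operations, the accrual, and percent operating accruals.

```latex
% Notation (ours): A = NI - CFO is the accrual, PA = A / |NI|.
\begin{align*}
\text{Suppose } A < 0 \text{ and } CFO < 0. \text{ Then}\quad
  NI   &= CFO + A < 0, \\
  |NI| &= |CFO| + |A| > |A|, \\
  PA   &= \frac{A}{|NI|} = -\frac{|A|}{|CFO| + |A|} > -1 .
\end{align*}
% Contrapositive: PA < -1 (which requires A < 0) implies CFO > 0,
% since CFO = 0 gives NI = A and hence PA = -1 exactly.
```

Because every lowest-decile cutoff lies at or below −3.1, the contrapositive implies positive cash from operations for all firms in that decile.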

Consider the examples of Georgia Pacific and Supervision, as shown in Figure 1, which are 
taken from the lowest decile of percent operating accruals. Both examples illustrate how a large 
positive cash flow and a large negative accrual combine to yield an extreme negative value of 
percent operating accruals. These examples also illustrate why the percent operating accruals 
measure is insensitive to firm size, unlike the traditional operating accruals measure: Georgia 
Pacific has earnings and cash flows (and unreported total assets) that are orders of magnitude 
greater than Supervision's numbers, yet they have almost identical percent operating accruals. As 
we show later, this results in significantly larger firms in the first decile of percent operating 
accruals than in the first decile of traditional operating accruals. These examples also illustrate 
why it makes sense to use the absolute value of earnings in the denominator, as one would not 
want to treat these two very similar examples differently. Finally, these examples illustrate why 
very poorly performing firms are unlikely to be sorted into the first decile of a percent accruals 
measure. Firms in the lowest percent operating accruals decile have positive cash flows and 


FIGURE 1 
Examples from the First Decile of Percent Operating Accruals 

Georgia Pacific 2001 (number line: 0; Net Income = 343; Cash from Operations = 1747) 
Percent Operating Accrual = (343 − 1747)/|343| = −4.09 

Supervision 2001 (number line: Net Income = −.259; 0; Cash from Operations = .8) 
Percent Operating Accrual = (−.259 − .8)/|−.259| = −4.08 





4 Five observations have zero net income. We sorted these into either decile 1 or decile 10, depending on the sign of the 
accrual. 
earnings near zero, while firms in the lowest traditional operating accruals decile have large losses 
and large negative cash flows. Consequently, sorting firms by percent accruals measures will 
spread poorly performing observations across the distribution, rather than concentrate them in the 
lowest decile. We document these empirical regularities in the next section. 
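
The arithmetic behind the two Figure 1 examples can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the inputs are read off the figure (net income of 343 and cash from operations of 1747 for Georgia Pacific; -.259 and .8 for Supervision). Because the figure shows rounded inputs, the Supervision result computed from them may differ from the reported -4.08 in the second decimal.

```python
def percent_operating_accruals(net_income, cash_from_operations):
    """Operating accruals (net income less cash from operations) scaled by |net income|."""
    if net_income == 0:
        # The paper's footnote sorts zero-income observations by the sign of the
        # accrual instead; this sketch simply refuses to divide by zero.
        raise ValueError("percent accruals are undefined when net income is zero")
    return (net_income - cash_from_operations) / abs(net_income)


def traditional_operating_accruals(net_income, cash_from_operations, average_total_assets):
    """Same numerator, scaled by average total assets."""
    return (net_income - cash_from_operations) / average_total_assets


# Inputs taken from Figure 1 (assumed to be in the same units within each firm).
georgia_pacific = percent_operating_accruals(343, 1747)
supervision = percent_operating_accruals(-0.259, 0.8)

print(round(georgia_pacific, 2))  # -4.09
print(supervision)
```

The two firms differ in scale by orders of magnitude, yet the ratio is nearly identical, which is the size-insensitivity point made in the text.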


III. SAMPLE AND DESCRIPTIVE STATISTICS 

We use the merged 2008 Compustat/CRSP database on the WRDS system to collect all firms 
with available data to construct our accrual measures. Consistent with prior studies, we exclude 
financial firms. The only data requirements are that the firm has sufficient financial statement data 
on Compustat to compute traditional operating accruals in the current year, and it has return data 
available at the portfolio formation date (the first day of the fourth month after the fiscal year-end). 
The result is a sample of 81,526 firm-years spanning the years 1989-2008. As discussed later, we
use the prior year's decile breaks to form portfolios, resulting in 19 years of annual portfolio 
returns for our returns tests. Table 1 reports descriptive statistics. Because our data requirements 
are so minimal, the sample is representative of the nonfinancial market as a whole for this time 
period. 


TABLE 1
Descriptive Statistics

                                                         Standard      Lower      Upper
Variable                               Mean     Median  Deviation   Quartile   Quartile
Percent operating accruals          -1.6860    -0.6753     5.2630    -1.6040    -0.0644
Traditional operating accruals      -0.0670    -0.0517     0.1345    -0.1077    -0.0059
Percent total accruals              -0.8576     0.1737     5.2010    -1.0090     1.0000
Traditional total accruals          -0.0730     0.0084     0.2784    -0.0986     0.0668
Return on Assets                    -0.0326     0.0311     0.2235    -0.0503     0.0764
Cash Return on Assets                0.0349     0.0685     0.1765    -0.0062     0.1285
Total Cash Flow Return on Assets    -0.1077    -0.0017     0.2875    -0.0944     0.0341
Market Value of Equity                1,696        138      5,448         30        743
Book Value/Market Value              0.6310     0.4949     0.6238     0.2698     0.8236
Price Per Share                       16.61      10.72      17.1€       3.88      24.00
Three-Year Sales Growth              0.1076     0.0589     0.4795    -0.1025     0.1869
Proportion of firms with losses      0.3445     1.0000     0.4752     0.0000     1.0000
% of Firms with Special Item <0      0.1988     0.0000     0.3991     0.0000     0.0000
Arbitrage Risk                       0.1516     0.1331     0.084С     0.0912     0.1892


The sample period is 1988-2008, consisting of 81,526 firm-years. Traditional operating accruals are defined as Net Income
(Compustat item 172) less Cash from Operations (item 308) divided by average total assets (item 6). Traditional total
accruals are computed as Net Income less Net Dividends/Distributions (item 115 + item 127 - item 108) and the change
in cash (item 308 + item 311 + item 313) divided by average total assets. Percent Operating Accruals and Percent Total
Accruals have the same numerators as Operating Accruals and Total Accruals, respectively, but the denominator for each
is the absolute value of Net Income. Return on Assets is Net Income divided by average total assets. Cash Return on Assets
is cash from operations divided by average total assets. Total Cash Flow Return on Assets is Net Dividends/Distributions
plus the change in cash divided by average total assets. Market Value of Equity is measured at the fiscal year-end (item 25
times item 199). Price (item 199) and Book Value (item 60) are measured at the fiscal year-end. Three-Year Sales Growth
is the average annual sales growth in the current year and prior two years. Special items (item 17) are coded as 0 if missing.
Arbitrage Risk is measured as the standard deviation of the residuals from a regression of firm-specific returns on the CRSP
equally weighted index for up to 48 months preceding the month of return portfolio formation. Amounts are Winsorized at
1 percent and 99 percent for this table and Table 2 only.

Table 2 gives the distribution of the four accruals measures as well as the distributions of
other firm characteristics across deciles. The first row in Panel A shows that to be in the lowest
decile of percent operating accruals in the average year, the negative accrual needs to be at least
4.04 times earnings; to be in the highest decile, the positive accrual needs to be at least 0.82 times
earnings. Comparing Panel A to the traditional operating accruals in Panel B shows that the types
of firms that are sorted into the extreme deciles are quite different. The mean market value in
decile 1 of percent operating accruals is $1,510 million versus $474 million for traditional operating
accruals. The firms in decile 1 of percent operating accruals are also performing much better,
with positive average return on assets and cash return on assets; in contrast, the firms in the first 
decile of traditional operating accruals are among the worst performers in the sample with the 
lowest mean return on assets and cash return on assets. At the other extreme, the firm size and 
performance in the tenth decile are very similar in Panels A and B. As we show in the next table, 
this is because the firms overlap significantly at this end of the two different accrual sorts. Finally, 
Panel A shows that the percent operating accruals measure sorts on cash flows to some degree. 
Because Desai et al. (2004) document a trading strategy based on cash flows and contrast it with 
an accruals trading strategy, in the following sections we examine the overlap between the percent 
accruals measures and cash flow measures. The remaining descriptive statistics will be discussed 
in Section V as part of the exploration into why percent accruals are successful in identifying 
misvalued stocks. 

Panels C and D of Table 2 give the distributions for the total accruals measures. Comparing 
the firms across the distributions of these two measures yields similar conclusions to the findings 
for the operating accrual measures, with one exception: sorting on percent total accruals does not 
sort on the cash return on assets. 

Table 3 gives evidence on the overlap between a sort on percent accruals and a sort on 
traditional accruals or cash from operations. Generally, the overlap between percent operating 
accruals and traditional operating accruals is small in the low deciles but higher in the high 
deciles. Only 11.78 percent of the firms in the first decile of percent operating accruals are in the 
first decile of traditional operating accruals. This increases to 72.95 percent overlap in the tenth 
decile. A more general way to document the overlap between the two sorts is to give the average
decile rank of one measure in each decile of the other measure. The decile rank of traditional 
operating accruals is 3.49 in the first decile of percent operating accruals (versus 1 if the two sorts 
were identical), and increases only marginally until the sixth decile where the average rank is 4.69. 
The average rank in the tenth decile is 9.70. 
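
The two overlap statistics described above (the share of firms whose deciles match, and the mean decile rank of one measure within each decile of the other) can be sketched as below. This is an illustrative helper under stated assumptions: the function name and the toy decile assignments are hypothetical, and each firm-year is assumed to already carry a decile (1 to 10) under both sorts.

```python
from collections import defaultdict


def overlap_by_decile(percent_deciles, traditional_deciles):
    """For each percent-accrual decile, return (share matching, mean traditional decile)."""
    groups = defaultdict(list)
    for pct_decile, trad_decile in zip(percent_deciles, traditional_deciles):
        groups[pct_decile].append(trad_decile)

    result = {}
    for decile in sorted(groups):
        ranks = groups[decile]
        match_share = sum(1 for rank in ranks if rank == decile) / len(ranks)
        mean_rank = sum(ranks) / len(ranks)
        result[decile] = (match_share, mean_rank)
    return result


# Toy data (hypothetical): four firms in percent-accrual decile 1, two in decile 10.
toy_percent = [1, 1, 1, 1, 10, 10]
toy_traditional = [1, 3, 4, 6, 10, 9]
overlap = overlap_by_decile(toy_percent, toy_traditional)
print(overlap)
```

If the two sorts were identical, every match share would be 1.0 and every mean rank would equal its own decile number.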

The sort on percent operating accruals produces a weak inverse sort on cash flows, as dis- 
cussed earlier. The mean decile rank of cash from operations is 7.07 in the lowest percent oper- 
ating accrual decile and is 2.44 in the highest decile (versus 10 and 1, respectively, if the two sorts 
were identical). The match between the two sorts is weak, however, with the average cash flow 
rank remaining above five until the ninth decile of percent operating accruals. In Section V we 
show that the returns to a percent operating accruals trading strategy are not driven by its corre- 
lation with cash from operations. 

Table 3, Panel B shows the overlap between percent and traditional total accruals. For total 
accruals, the overlap is a bit stronger for low accruals and a bit weaker for high accruals. There is 
a 23.17 percent overlap in the first decile but only a 46.27 percent overlap in the tenth decile. The 
relevant cash flow comparison for total accruals is the net dividends/distributions plus the change 
in cash (i.e., total cash flows). Column three of Panel B shows that percent total accruals is picking 
out the extremes for total cash flows, with an average rank of 8.60 and 2.03 in the first and tenth 
deciles, respectively, although it does not appear to sort well through the middle of the distribution.


TABLE 3
Overlap between Percent Accruals, Traditional Accruals, and Cash Flow Deciles

Panel A: Percent Operating Accruals

Percent Operating   Traditional Operating Accrual   Decile Rank of          Decile Rank of Cash from
Accrual Decile      Decile Matches Percent          Traditional Operating   Operations Scaled by
                    Operating Accrual Decile        Accruals                Total Assets
1                   11.78%                          3.49                    7.07
2                   20.38%                          3.57                    7.05
3                   18.06%                          3.66                    6.76
4                   14.75%                          3.72                    5.98
5                   15.14%                          3.99                    5.54
6                   21.14%                          4.69                    5.43
7                   35.67%                          5.67                    5.24
8                   58.33%                          7.51                    5.11
9                   59.66%                          9.03                    4.40
10                  72.95%                          9.70                    2.44

Panel B: Percent Total Accruals

Percent Total       Traditional Total Accrual       Decile Rank of          Decile Rank of Net Dividends
Accrual Decile      Decile Matches Percent          Traditional Total       and Change in Cash Scaled
                    Total Accrual Decile            Accruals                by Total Assets
1                   23.17%                          2.36                    8.60
2                   30.04%                          2.01                    6.23
3                   29.23%                          2.29                    4.71
4                   52.09%                          3.57                    5.82
5                   49.58%                          5.54                    6.98
6                   32.44%                          6.73                    7.09
7                   25.15%                          7.46                    6.14
8                   23.89%                          7.91                    4.45
9                   26.65%                          8.35                    2.99
10                  46.27%                          8.80                    2.03


Traditional operating accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308) 
divided by average total assets (item 6). Traditional total accruals are computed as Net Income less Net Dividends/ 
Distributions (item 115 + item 127 — item 108) and the change in cash (item 308 + item 311 + item 313) divided by 
average total assets. Percent Operating Accruals and Percent Total Accruals have the same numerators as Operating 
Accruals and Total Accruals, respectively, but the denominator for each is the absolute value of Net Income.


To summarize, the lowest deciles of percent accruals are generally composed of different 
firms than the lowest deciles of traditional accruals—the firms for which the trading strategy will 
be long. The lowest decile of percent accruals firms are three times larger and much better 
performing than the lowest decile of traditional accruals, based on either operating or total accru- 
als. The overlap between the percent accruals measures and the traditional accruals measures is 
much higher in the extreme positive deciles, although still far from perfect. 


IV. FUTURE EXCESS RETURNS TO ACCRUAL TRADING STRATEGIES 
In this section we document the return to the percent accruals trading strategy and compare it 
to the traditional accruals strategy and to a cash-from-operations strategy.

Excess Return Computation


Excess returns are computed as buy-and-hold size-adjusted returns, calculated as described in 
Barber et al. (1999). This method starts by constructing ten size portfolios based on December 
market value of equity of all NYSE firms. AMEX and NASDAQ firms are then placed in the 
appropriate size decile using the NYSE breakpoints. Because NYSE firms are typically much 
larger than AMEX and NASDAQ firms, this results in almost half the observations being sorted 
into the smallest portfolio. Barber et al. (1999) correct for this skewness by further subdividing the 
smallest portfolio into four size groups, so that in the end there are 14 size-referent portfolios.5 For
each size portfolio and each month, the returns for each security in the portfolio are compounded 
over the following 12 months and then averaged across all the securities in the portfolio. The 
resulting annual return on each reference portfolio constitutes a passive, equally weighted invest- 
ment in all securities in the portfolio. Excess returns are then the difference between the firm’s 
annual return and the size-matched reference portfolio. For firms that are delisted during the future 
return period, we calculate the remaining return by taking CRSP’s delisting return and then 
reinvesting the proceeds in the equally weighted reference portfolio. For firms delisted due to poor 
performance (delisting codes 500 and 520-584), we use a -35 percent delisting return for NYSE/
AMEX firms and a -55 percent delisting return for NASDAQ firms, as recommended in Shumway
(1997) and Shumway and Warther (1999). Note that the overall mean excess return is zero by
construction. For all our results the return window is one year, beginning with the first day of the 
fourth month after the fiscal year-end. 
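
The return mechanics in this subsection can be sketched roughly as follows. This is a simplified illustration, not the Barber et al. (1999) procedure itself: all function names are hypothetical, the input is assumed to be a plain list of monthly returns, and the reinvestment step assumes that monthly returns on the reference portfolio are available for the months remaining after a delisting.

```python
def buy_and_hold(monthly_returns):
    """Compound a sequence of monthly returns into one holding-period return."""
    wealth = 1.0
    for r in monthly_returns:
        wealth *= 1.0 + r
    return wealth - 1.0


def is_performance_delisting(delisting_code):
    """Delisting codes 500 and 520-584, as listed in the text."""
    return delisting_code == 500 or 520 <= delisting_code <= 584


def performance_delisting_return(exchange):
    """Substituted delisting return: -55% for NASDAQ, -35% for NYSE/AMEX."""
    return -0.55 if exchange == "NASDAQ" else -0.35


def firm_return(monthly_returns, delisting_return=None, reference_monthly_after=()):
    """Firm return over the window; after a delisting, proceeds earn the reference return."""
    wealth = 1.0 + buy_and_hold(monthly_returns)
    if delisting_return is not None:
        wealth *= 1.0 + delisting_return
        wealth *= 1.0 + buy_and_hold(reference_monthly_after)
    return wealth - 1.0


def reference_return(member_monthly_returns):
    """Equally weighted average of each member security's compounded return."""
    compounded = [buy_and_hold(r) for r in member_monthly_returns]
    return sum(compounded) / len(compounded)


def excess_return(firm_ret, reference_ret):
    return firm_ret - reference_ret


# Hypothetical two-month window: the firm earns 10%, is delisted for poor
# performance on NYSE/AMEX, and the proceeds earn 2% in the reference portfolio.
delisted = firm_return([0.10], performance_delisting_return("NYSE"), [0.02])
benchmark = reference_return([[0.10, 0.10], [0.0, 0.0]])
print(excess_return(delisted, benchmark))
```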

All significance tests of excess returns are based on Fama-MacBeth t-statistics in order to 
control for the cross-sectional correlation between firms in a given decile for a given year. In 
particular, the 19 annual mean excess returns in a decile are averaged and the t-statistic is based on 
the standard deviation of these 19 observations. 
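
A minimal sketch of the test statistic described above is given below. It assumes the conventional form, the time-series mean of the annual means divided by its standard error (sample standard deviation over the square root of the number of years); the function name and the example inputs are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fama_macbeth_t(annual_mean_returns):
    """Return (time-series mean, t-statistic) for a list of annual mean excess returns."""
    n = len(annual_mean_returns)
    average = mean(annual_mean_returns)
    standard_error = stdev(annual_mean_returns) / sqrt(n)
    return average, average / standard_error


# Hypothetical three-year example; the paper uses 19 annual observations per decile.
average, t_stat = fama_macbeth_t([0.02, 0.04, 0.06])
print(average, t_stat)
```

Treating each year's decile mean as a single observation is what keeps the cross-sectional correlation among firms within a year from inflating the test statistic.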

We take extra care to ensure that the returns we report are the result of an implementable 
trading strategy. For each factor, we use the breakpoints between deciles from the previous year to 
divide the firms in the current year into ten portfolios. For example, for return windows starting in 
1999, we use all observations between August 1998 (four months prior to January 1999) and 
September 1997 to construct a distribution for the factor and identify the ten breakpoints. This 
causes our ten portfolios to have slightly different numbers of observations, but makes it abso- 
lutely certain that the data necessary to sort the firms into portfolios are available when the return 
window opens. 
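
The prior-year breakpoint rule can be sketched as below. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, the cut points are computed with Python's `statistics.quantiles`, and a value equal to a cut point is placed in the lower decile. The example reuses the "Maximum" column of Table 4, Panel A as stand-in breakpoints.

```python
from bisect import bisect_left
from statistics import quantiles


def decile_breakpoints(prior_year_values):
    """Nine cut points separating ten deciles, estimated from last year's distribution."""
    return quantiles(prior_year_values, n=10, method="inclusive")


def assign_decile(value, breakpoints):
    """Decile (1-10) for a current-year value, using breakpoints fixed in advance."""
    return bisect_left(breakpoints, value) + 1


# Stand-in breakpoints: the decile maxima reported in Table 4, Panel A.
table4_maxima = [-4.04, -2.04, -1.34, -0.96, -0.67, -0.42, -0.18, 0.14, 0.82]
print(assign_decile(-4.09, table4_maxima))  # 1
print(assign_decile(-0.50, table4_maxima))  # 6
print(assign_decile(0.90, table4_maxima))   # 10
```

Because the cut points come from the earlier year, current-year portfolios need not contain equal numbers of firms, which matches the slightly uneven observation counts noted in the text.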

Kraft et al. (2006) document a look-ahead bias in many accruals studies. Because many 
studies were interested in examining the evolution of accruals as well as the stock returns to an 
accruals-based trading strategy, the sample selection required that the next year’s earnings be 
present. But whether earnings are present in the next year is not known at the time of portfolio 
formation, so the documented returns are not the result of an implementable trading strategy. 
Further, the firms with missing future data are not randomly selected; one reason data might be 
missing in the future is because the firm could have entered bankruptcy. Consistent with this 
conjecture, Kraft et al. (2006) show that, in a sample of NYSE/AMEX firms, the return to the 
low-accrual portfolio is 4.2 percent with the bias present but only 1.8 percent without it. In both 
Kraft et al. (2006) and our study, the excess return in the lowest traditional operating accrual decile 
is no longer significantly different from zero once the look-ahead bias is removed. The magnitude 
of the return in the lowest accrual decile is a particularly important part of the evidence supporting
the accruals anomaly. Without a significantly positive return to the long position in the lowest
accrual decile, it is possible that the hedge return no longer exceeds transaction costs, especially
given the high transaction costs associated with taking a short position in the high-accrual decile.


5 Our results are very similar if we use only ten size portfolios. We prefer the 14-portfolio approach because it provides
a closer size match for the smallest firms. Given that many market anomalies are more pronounced for small firms,
controlling for size is particularly important.


Return Results 


Table 4 reports the returns by deciles sorted on percent and traditional operating accruals. The
hedge return (decile 1 less decile 10) to percent operating accruals is 11.68 percent and is significant
at p < 0.001. The hedge return comes equally from the long and short position, with an
excess return of 5.53 percent in decile 1 and an excess return of -6.15 percent in decile 10; both
are significant. In contrast, the hedge return to traditional operating accruals in Panel B is 6.51
percent, but is insignificant with a p-value of 0.186. Further, the excess return in decile 1 is only
1.27 percent and is not significant. Consistent with the fact that the overlap between the two
accrual definitions is high for the top decile, the excess return in decile 10 of traditional operating
accruals is -5.24 percent and is significant (p = 0.029). Figure 2 compares the hedge returns to
percent and traditional operating accrual strategies for each of the 19 years in the sample. The
hedge based on percent operating accruals is larger in 15 of the 19 years; a simple binomial test is
significant (p < 0.001).


TABLE 4
Mean Annual Size-Adjusted Returns to Percent Operating Accrual and Traditional
Operating Accrual Portfolios

Panel A: Percent Operating Accruals

                                    Size-Adjusted                Number of
Decile                   Maximum    Return           p-value     Obs.
1                        -4.04       0.0553          <0.001      8,210
2                        -2.04       0.0516          <0.001      8,146
3                        -1.34       0.0550          <0.001      8,139
4                        -0.96       0.0416          <0.001      8,296
5                        -0.67       0.0142           0.416      8,251
6                        -0.42       0.0289           0.279      8,182
7                        -0.18       0.0112           0.667      8,099
8                         0.14       0.0029          >0.500      8,139
9                         0.82      -0.0210           0.226      8,109
10                                  -0.0615           0.009      7,955
Decile 1 - Decile 10                 0.1168          <0.001

Panel B: Traditional Operating Accruals

                                    Size-Adjusted                Number of
Decile                   Maximum    Return           p-value     Obs.
1                        -0.2023     0.0127          >0.500      8,237
2                        -0.1286     0.0673           0.018      8,100
3                        -0.0926     0.0444           0.092      8,197
4                        -0.0696     0.0403           0.005      8,234
5                        -0.0516     0.0266           0.024      8,040
6                        -0.0343     0.0301           0.004      8,313
7                        -0.0156     0.0167           0.087      8,155
8                         0.0105    -0.0133           0.282      8,156
9                         0.0594    -0.0106           0.490      8,127
10                                  -0.0524           0.029      7,973
Decile 1 - Decile 10                 0.0651           0.186

Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999)
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly.
Traditional operating accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308)
divided by average total assets (item 6). Percent operating accruals have the same numerator but scale by the absolute value
of net income.

Table 5 reports the returns to percent total accruals and traditional total accruals. Panel A
shows that the hedge return to percent total accruals is 8.53 percent and significant (p = 0.001).
The long position in decile 1 contributes 5.49 percentage points of the hedge return, and is
significant (p = 0.005). In contrast, Panel B shows that the hedge return to traditional total accruals
is not significant. The excess return to the long position is only 1.60 percent and insignificant,
although the excess return to decile 10 is -4.23 percent and significant (p = 0.048).

In summary, the percent accruals definitions appear to dominate the traditional accrual defi- 
nitions for both operating and total accruals. Percent accruals generate larger hedge returns, larger 
excess returns in decile 1 where the two definitions differ the most, and more significant results in 
all the extreme deciles. 

The fact that the accrual anomaly is large and significant when measured as percent accruals 
offers researchers an empirical regularity that is difficult to reconcile with an efficient stock
market. It is also inconsistent with the Kothari et al. (2006) agency hypothesis, insofar as the long
position contributes a large and statistically significant portion of the excess return. In the next
section we explore why extreme values of percent accruals identify mispriced stocks.


FIGURE 2
Annual Size-Adjusted Hedge Returns to Percent and Traditional Operating Accruals (by
year in which the 12-month return window begins)


6 There are 967 observations that are in both the first decile of percent operating accruals and the first decile of traditional
operating accruals. The mean return for these firms is 7.1 percent, although it is not statistically significant (p = 0.110).


TABLE 5
Mean Annual Size-Adjusted Returns to Percent Total Accrual and Traditional Total
Accrual Portfolios

Panel A: Percent Total Accruals

                                    Size-Adjusted                Number of
Decile                   Maximum    Return           p-value     Obs.
1                        -1.436      0.0549           0.005      7,461
2                        -0.890      0.0383           0.110      7,400
3                        -0.439      0.0162          >0.500      7,290
4                         0.008      0.0387           0.242      7,357
5                         0.350      0.0178           0.240      7,193
6                         0.658      0.0239           0.035      7,290
7                         0.929      0.0076          >0.500      7,225
8                         1.291      0.0165           0.332      7,389
9                         2.437     -0.0093           0.289      7,264
10                                  -0.0304           0.092      7,264
Decile 1 - Decile 10                 0.0853           0.001

Panel B: Traditional Total Accruals

                                    Size-Adjusted                Number of
Decile                   Maximum    Return           p-value     Obs.
1                        -0.177      0.0160          >0.500      7,418
2                        -0.073      0.0604           0.133      7,446
3                        -0.027      0.0325           0.028      7,324
4                         0.000      0.0420           0.002      7,270
5                         0.019      0.0139           0.429      7,227
6                         0.037      0.0012          >0.500      7,237
7                         0.060      0.0212           0.236      7,272
8                         0.095      0.0357           0.005      7,306
9                         0.165     -0.0009           0.500      7,341
10                                  -0.0423           0.048      7,298
Decile 1 - Decile 10                 0.0583           0.313

Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999)
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly.
Traditional total accruals are computed as Net Income (Compustat item 172) less Net Dividends/Distributions
(item 115 + item 127 - item 108) and the change in cash (item 308 + item 311 + item 313) divided by average total
assets (item 6). Percent total accruals have the same numerator but scale by the absolute value of net income.




V. WHAT DRIVES THE RETURNS TO PERCENT ACCRUALS? 
In this section, we explore the source of the excess returns in the portfolios of extreme percent 
accruals, focusing in particular on the long position in the lowest decile. For brevity, we only 
consider percent operating accruals, but the results and conclusions are similar for percent total
accruals. Recall that only 12 percent of the observations in the first decile of the traditional
operating accruals measure are in the first decile of percent operating accruals. What is the 
difference between the newly constructed accrual portfolios and the traditional ones, particularly 
in the lowest accruals portfolio? We offer one explanation and then eliminate a number of other 
tempting-but-wrong explanations for our return results. 

One reason percent operating accruals identifies misvalued stocks is that this measure suc- 
cessfully sorts observations that are more "extreme" into the extreme portfolios than does the 
traditional operating accruals measure. To understand this, consider the naive and sophisticated 
forecasting models posited in the "earnings fixation" hypothesis. Because the scaling variable is 
what is at issue, we specify each model with unscaled variables and then control for size by 
including common equity (item 60) as another independent variable, as discussed in Barth and 
Kallapur (1996), and we estimate the regression on percentile ranks.7 The naive model is:

$$\text{Next Year's Net Income} = \alpha_0 + \alpha_1 \cdot \text{Net Income} + \alpha_2 \cdot \text{Common Equity} + \varepsilon.$$

The more sophisticated model distinguishes between cash flows and accruals, resulting in:

$$\text{Next Year's Net Income} = \beta_0 + \beta_1 \cdot \text{Cash from Operations} + \beta_2 \cdot \text{Operating Accruals} + \beta_3 \cdot \text{Common Equity} + \varepsilon.$$

Noting that net income is identically equal to cash from operations plus operating accruals, the
expected difference between the predictions of the two models is:

$$\text{Difference} = (\beta_0 - \alpha_0) + (\beta_1 - \alpha_1) \cdot \text{Cash from Operations} + (\beta_2 - \alpha_1) \cdot \text{Operating Accruals} + (\beta_3 - \alpha_2) \cdot \text{Common Equity}.$$

When the difference is positive, the sophisticated model predicts higher next year's net income
than the naive model. Under the earnings fixation hypothesis, this approach identifies a stock as
being undervalued: the market behaves as if it is using the naive model and is subsequently
surprised when next year's net income is higher than expected. Consistent with Sloan (1996) and
a host of other studies, we show later that $\beta_1 > \alpha_1 > \beta_2$; that is, cash from operations is more
persistent than accruals, and the persistence of earnings lies somewhere in between. With this, the
most extreme positive differences occur when cash from operations is an extreme positive amount 
and operating accruals is an extreme negative amount. These are exactly the observations that 
percent operating accruals sorts into the first decile. The large negative accrual results in a large 
negative numerator and the large positive cash flow of approximately the same magnitude makes 
the denominator small. In contrast, while the first decile of the traditional accruals contains large 
negative accruals, it also contains many firms with large negative cash from operations (as seen in 
Table 2). 

To quantify this effect, Table 6, Panel A reports the coefficients from the naive and sophisti- 
cated models, estimating each regression using the percentile ranks of the variables (as in Sloan 
1996). To partially control for the look-ahead bias inherent in this type of analysis, if the firm is 
missing next year’s earnings, then it is set equal to the negative of this year’s common equity. By 
writing off all the book value, we are effectively assuming the firm was liquidated. The results in
Table 6 are not sensitive to this assumption; simply eliminating the missing observations yields
similar results. As evident in the table, all coefficients are highly significant, and the relative
persistence is consistent with prior studies: cash flows are the most persistent, followed by
earnings, and then by accruals.


7 Whether we control for size by including Common Equity in the model has little effect. The coefficient on this variable
is almost the same in the naive and sophisticated models, and so its effect washes out when computing the difference
between the models. Consequently, the results in Table 6 are almost identical without this variable.


TABLE 6
Difference between Sophisticated and Naive Forecasts in Different Accrual Portfolios

Panel A: Percentile Rank Regression of Next Year's Earnings (unscaled)

                                   Current     Cash from     Operating    Common
                       Intercept   Earnings    Operations    Accruals     Equity
Naive model            21.96       0.505                                  0.051
  p-value              <0.001      <0.001                                 <0.001
Sophisticated model    8.86                    0.575         0.198        0.048
  p-value              <0.001                  <0.001        <0.001       <0.001

Panel B: Mean Differences between Sophisticated and Naive Forecasts by Accrual Decile

          Sorted by Traditional    Sorted by Percent
Decile    Operating Accruals       Operating Accruals
1*          4.6                      8.6
2           4.6                      5.9
3           2.7                      4.3
4           1.3                      1.3
5           0.0                     -2.4
6          -1.1                     -3.6
7          -1.6                     -3.1
8          -0.7                     -0.8
9          -1.9                     -0.8
10*        -8.4                     -9.9

* Significantly different using either a two-sample t-test or a Wilcoxon test at the <0.001 level.

For each observation we compute the naive forecast and the sophisticated forecast based on the models in Panel A, and
then compute the difference in forecasts. Panel B reports the mean difference in forecasts between the two models for each
decile of accruals, sorted on either traditional operating accruals or percent operating accruals. Earnings is data item 172,
cash from operations is data item 308, and operating accruals is earnings less cash from operations. Common Equity is data
item 60. The dependent variable is next year's earnings. All variables are unscaled. Missing values of the dependent
variable are replaced with the negative value of current common equity (item 60).



In Table 6, Panel B we quantify exactly how sorting by percent accruals selects more extreme 
differences between the naive and sophisticated models. For each observation, we compute the 
naive forecast, the sophisticated forecast, and the difference between the two forecasts. We then 
report the mean of these differences sorted by percent operating accruals and traditional operating 
accruals. For instance, in decile 1 of traditional operating accruals, the sophisticated forecast is, on 
average, 4.6 percent larger than the naive forecast. By comparison, the mean difference between
the forecasts in decile 1 of percent operating accruals is 8.6 percent, almost twice as extreme as the
difference for traditional operating accruals. Similarly, the difference in forecasts is -8.4 percent
in the tenth decile of traditional operating accruals, as compared to -9.9 percent for the tenth
decile of percent operating accruals. The differences in the first decile and tenth decile portfolios 
are significant at the 0.001 level. 
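
The per-observation calculation behind Panel B can be sketched as below, using the coefficient estimates reported in Table 6, Panel A. This is an illustration only: the function names are hypothetical, and the inputs are assumed to be percentile ranks on a 0-100 scale (an assumption suggested by the size of the intercepts, not stated in the text). Each variable is ranked separately, so the earnings rank is a separate input rather than the sum of the cash flow and accrual ranks.

```python
# Coefficient estimates copied from Table 6, Panel A.
NAIVE = {"intercept": 21.96, "earnings": 0.505, "common_equity": 0.051}
SOPHISTICATED = {
    "intercept": 8.86,
    "cash_from_operations": 0.575,
    "operating_accruals": 0.198,
    "common_equity": 0.048,
}


def naive_forecast(earnings_rank, equity_rank):
    return (NAIVE["intercept"]
            + NAIVE["earnings"] * earnings_rank
            + NAIVE["common_equity"] * equity_rank)


def sophisticated_forecast(cfo_rank, accrual_rank, equity_rank):
    return (SOPHISTICATED["intercept"]
            + SOPHISTICATED["cash_from_operations"] * cfo_rank
            + SOPHISTICATED["operating_accruals"] * accrual_rank
            + SOPHISTICATED["common_equity"] * equity_rank)


def forecast_difference(earnings_rank, cfo_rank, accrual_rank, equity_rank):
    """Sophisticated minus naive forecast; positive values suggest undervaluation."""
    return (sophisticated_forecast(cfo_rank, accrual_rank, equity_rank)
            - naive_forecast(earnings_rank, equity_rank))


# Hypothetical firm: very high cash flow rank, very low accrual rank, median earnings.
example = forecast_difference(earnings_rank=50, cfo_rank=95, accrual_rank=5, equity_rank=50)
print(round(example, 3))  # 17.115
```

The example reflects the pattern described in the text: a large positive cash flow paired with a large negative accrual produces a sophisticated forecast well above the naive one.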
Bradshaw et al. (2001) provide evidence that analysts fail to fully account for the information
in accruals in their forecasts. If extreme percent operating accruals are better than traditional
accruals at identifying observations where there are extreme differences between the naive and
sophisticated forecasts, then it is possible that they will also do a better job of describing the
error in analyst forecasts. To examine this possibility, we regress signed analyst forecast errors of 
earnings per share on either traditional operating accruals or percent operating accruals. The 
forecast is the consensus median forecast in the fourth month following the fiscal year-end (the 
same month that the return window opens) and the realization is for the following fiscal year, both 
taken from I/B/E/S. Table 7 reports the results. The accruals measures are converted to their decile 
rank and then scaled to fall in (0, 1]. Consistent with Bradshaw et al. (2001), the coefficient on
traditional operating accruals is -0.0039 with a t-statistic of -6.1. Analysts systematically
overestimate earnings per share for high-accrual firms and underestimate earnings per share for
low-accrual firms. But more importantly, the relation is stronger when percent operating accruals are
used, with a coefficient of -0.0049 and a t-statistic of -8.2.

In sum, by conditioning the accrual on the level of net income, percent operating accruals 
effectively pick out more extreme combinations of cash flows and accruals, resulting in more 
extreme differences in the naive and sophisticated forecast models. Consistent with this pattern, 
analyst forecast errors have a stronger negative association with percent operating accruals than 
with traditional operating accruals. This evidence supports the earnings fixation hypothesis, inso- 
far as the mispricing of accruals is greatest in places where the hypothesis predicts the greatest 
differences in naive and sophisticated beliefs. 


Eliminating Some Alternative Explanations for the Success of Percent Accruals 


In this section, we run through a number of other candidate reasons for the superior perfor- 
mance of the percent operating accruals. 


TABLE 7 
Regression of Analyst Forecast Errors on Traditional Operating Accruals and Percent 
Operating Accruals 
Decile Rank of Decile Rank of 
Traditional Operating Percent Operating 
Intercept Accrual Accrual 
Model 1 —0.0107 --0.0039 
t-statistic —24.2 —6.1 
p-value «0.001 «0.001 
Model 2 —0.0102 — 0.0049 
t-statistic —25.0 -8.2 
p-value «0.001 «0.001 


The forecast error is measured as the actual earnings per share less the median forecast in the fourth month following the 
fiscal year-end (і.е., the first forecast after the portfolio formation month), taken from I/B/E/S. Traditional operating 
accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308) divided by average total 
assets (item 6). Percent operating accruals have the same numerator but scale by the absolute value of net income. The 
independent variables are ranked into deciles, then scaled to be in [0, 1]. The sample size is 40,352. 

Percent Accruals Proxies for Cash from Operations

Desai et al. (2004) document a powerful trading strategy based on the ratio of cash from
operations to price, and argue that accruals are simply a proxy for this variable, given the strong
negative correlation between accruals and cash from operations. They scale by price to capture the
well-known value-glamour anomaly (Lakonishok et al. 1994). The value-glamour effect is outside
our scope, but in Table 9 of their article, they regress future returns on cash from operations scaled
by total assets and accruals scaled by total assets. They find that accruals are no longer significantly
related to future returns, but cash from operations remains highly significant. It is therefore
possible that percent operating accruals proxy for a more fundamental cash-from-operations
anomaly. To examine this possibility, we first document the hedge return to cash from operations
scaled by total assets, shown in Table 8. We do not find a significant hedge return, but the return
to the long position in the high cash-from-operations portfolio is 6.62 percent and significant at
the less than 0.001 level. In Table 9, we estimate regressions of next year's returns on the decile rank
of percent operating accruals and cash from operations, where the rank is scaled to fall in [0, 1].
In model 1, the coefficient on percent operating accruals is -0.0993, implying that the expected
hedge return between the extreme deciles is 9.93 percent, and is significant (p = 0.001). In contrast, the
coefficient on cash from operations is insignificant in model 2. Finally, when both variables
are included in the regression in model 3, the estimated return and significance of percent operating
accruals fall slightly, but the coefficient is still significant, while cash from operations remains
insignificant. In sum, sorting on percent operating accruals weakly sorts on cash from operations, as shown
in Table 2, but the association with cash flows is not the driving force behind the success of
percent operating accruals.
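
Because the decile rank is scaled to [0, 1], the slope coefficient can be read directly as the difference in fitted returns between the extreme deciles. The short Python sketch below works through that arithmetic with the Model 1 estimates from Table 9; the function names are hypothetical and the calculation is only an illustration of the interpretation, not a re-estimation.

```python
def predicted_return(intercept: float, slope: float, scaled_rank: float) -> float:
    """Fitted size-adjusted return for a decile rank scaled to [0, 1]."""
    return intercept + slope * scaled_rank


def implied_hedge_return(intercept: float, slope: float) -> float:
    """Long the lowest decile (rank 0), short the highest decile (rank 1)."""
    low = predicted_return(intercept, slope, 0.0)
    high = predicted_return(intercept, slope, 1.0)
    return low - high


# Table 9, Model 1 estimates as reported in the text.
hedge = implied_hedge_return(intercept=0.0674, slope=-0.0993)
print(f"{hedge:.4f}")  # prints 0.0993
```

The intercept cancels in the difference, so the implied hedge return is simply the negative of the slope.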


TABLE 8
Mean Annual Size-Adjusted Returns to Cash from Operations Scaled by Total Assets
Portfolios

                                   Size-Adjusted                Number
Decile                 Maximum     Return          p-value      of Obs.
1                      -0.1522     -0.0496         0.474        8,240
2                      -0.0362      0.0030         >0.500       8,043
3                       0.0139      0.0026         >0.500       8,041
4                       0.0442      0.0145         0.448        8,040
5                       0.0681      0.0273         0.049        8,276
6                       0.0900      0.0273         0.024        8,092
7                       0.1140      0.0378         0.003        8,150
8                       0.1435      0.0350         0.005        8,132
9                       0.1919      0.0307         0.032        8,251
10                                  0.0662         <0.001       8,267
Decile 10 - Decile 1                0.1158         0.113


Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999) 
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics 
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints 
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly. 
Portfolios are sorted by Cash from Operations (item 308) divided by average total assets (item 6).
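
The test statistic described in the note above can be sketched as follows. This is a minimal illustration, assuming the sample standard deviation (n - 1 denominator) is used; the function name and the example figures are hypothetical, and the two-tailed p-value would then be read from a t distribution.

```python
import math
import statistics
from typing import Sequence


def fama_macbeth_t(annual_means: Sequence[float]) -> float:
    """t-statistic for the time-series mean of annual portfolio returns.

    Assumption: standard error = sample standard deviation / sqrt(number of years).
    """
    n = len(annual_means)
    if n < 2:
        raise ValueError("need at least two annual observations")
    mean = statistics.fmean(annual_means)
    standard_error = statistics.stdev(annual_means) / math.sqrt(n)
    return mean / standard_error


# Hypothetical three-year example (the tables use 19 annual means per decile).
t_stat = fama_macbeth_t([0.02, 0.04, 0.06])
```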

TABLE 9
Regressions of Annual Size-Adjusted Returns on Percent Operating Accruals and Cash
from Operations

                           Decile Rank of        Decile Rank of Cash
                           Percent Operating     from Operations
             Intercept     Accrual               Scaled by Total Assets
Model 1       0.0674       -0.0993
  p-value    <0.001         0.001
Model 2      -0.0212                              0.0807
  p-value     0.624                               0.218
Model 3       0.0346       -0.0784                0.0474
  p-value     0.483        <0.001                 0.485


Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999) 
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics 
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints 
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly. 
Percent operating accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308), divided 
by the absolute value of Net Income. The independent variables are ranked into deciles then scaled to be in [0, 1]. 


Special Items 

A common reason a firm has large, negative, unscaled accruals is that it records a non-cash
special item, such as an asset write-off, in the period. Dechow and Ge (2006) study this
explanation for the traditional accruals measure, reporting that the positive returns in the low-accrual
decile in their sample are driven by firms with large negative special items. Indeed, Table 2,
Panel B shows that 55.00 percent of the firms in the first decile of traditional accruals have
negative special items. In contrast, only 22.83 percent of the firms in the lowest decile of percent
accruals have a negative special item. In Table 10, we partition each decile into observations with
or without a negative special item and examine the returns within each partition.

Table 10, Panel A shows that for both subsamples the hedge return to percent accruals is large
and significant; it is 11.88 percent when there is no negative special item and 9.94 percent when
a special item is present. Further, the excess return in the first decile is approximately the same
size regardless of the special item. Table 10, Panel B shows results consistent with the unpartitioned
results for traditional operating accruals; the hedge return is insignificant in both subsamples,
as are the returns in the first decile. In sum, the returns to a trading strategy based on
percent operating accruals are not particularly sensitive to the presence or absence of a negative
special item.


Gains versus Losses 
Table 11 bisects the data based on whether the firm reported positive or negative net income. 
When the data are sorted based on traditional operating accruals, the results are very similar to 


We consider two alternative partitions of the sample: (1) the definition in Dechow and Ge (2006) requires that the 
negative special item exceeds 2 percent of total assets, selecting about 20 percent of the sample, and (2) a definition in 
the spirit of percent accruals is that the negative special item exceed 25 percent of absolute net income, which also 
selects about 20 percent of the sample. We prefer using the (unscaled) existence of a negative special item because it 
yields the same partition regardless of the scale variable used. The conclusions in Table 10 are qualitatively similar using 
either of these alternative definitions. 

TABLE 10
Mean Annual Size-Adjusted Returns for Operating Accruals by Special Item Subsamples

Panel A: Percent Operating Accruals

            No Negative Special Item (64%)      Special Item < 0 (36%)
                                   % of                              % of
Decile      Return     p-value     sample       Return     p-value   sample
1            0.0511    0.003       5.5%          0.0604    0.002     4.5%
2            0.0414    0.004       5.7%          0.0698    0.002     4.3%
3            0.0449    0.005       5.7%          0.0745    0.001     4.3%
4            0.0446    0.007       5.6%          0.0355    0.076     4.5%
5            0.0112    0.375       5.8%          0.0020    >0.500    4.3%
6            0.0021    >0.500      6.5%          0.0532    0.354     3.5%
7            0.0064    >0.500      6.9%          0.0194    >0.500    3.1%
8           -0.0012    >0.500      7.4%         -0.0116    >0.500    2.6%
9           -0.0272    0.077       7.6%          0.0078    >0.500    2.4%
10          -0.0678    0.019       7.4%         -0.0390    0.171     2.3%
D1 - D10     0.1188    <0.001                    0.0994    0.035

Panel B: Traditional Operating Accruals

            No Negative Special Item (64%)      Special Item < 0 (36%)
                                   % of                              % of
Decile      Return     p-value     sample       Return     p-value   sample
1           -0.0446    0.125       3.8%          0.0416    0.466     6.3%
2            0.0581    0.006       5.2%          0.0742    0.053     4.7%
3            0.0331    0.153       6.0%          0.0675    0.082     4.1%
4            0.0349    0.000       6.4%          0.0553    0.165     3.7%
5            0.0260    0.023       6.5%          0.0238    >0.500    3.4%
6            0.0244    0.028       6.9%          0.0890    0.042     3.2%
7            0.0158    0.123       7.0%          0.0183    >0.500    3.0%
8           -0.0134    0.301       7.3%          0.0192    >0.500    2.7%
9           -0.0095    >0.500      7.4%         -0.0369    >0.500    2.6%
10          -0.0569    0.028       7.7%          0.0254    >0.500    2.1%
D1 - D10     0.0123    >0.500                    0.0162    0.197


Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999) 
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics 
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints 
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly. 
Traditional operating accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308) 
divided by average total assets (item 6). Percent operating accruals have the same numerator but scale by the absolute value 
of net income. Special Items (item 17) are coded as 0 if missing. 


those reported by Dopuch et al. (2009), with no significant returns for any decile in the loss
subsample, as shown in Panel B. This is a particularly troublesome result for the traditional
accrual anomaly, as 34 percent of the observations in the population have losses, and there is
nothing in the naive investor hypothesis that says the result should only hold for firms with
positive net income. However, when the data are sorted based on percent operating accruals, Panel
A of Table 11 shows that the hedge return and the excess return for the lowest decile are significant
in both subsamples, and actually somewhat larger and more significant in the loss subsample than
in the gain subsample.

TABLE 11
Mean Annual Size-Adjusted Returns for Operating Accruals by Gain/Loss Subsamples

Panel A: Percent Operating Accruals

            Loss (34%)                          Gain (66%)
                                   % of                              % of
Decile      Return     p-value     sample       Return     p-value   sample
1            0.0715    <0.001      3.7%          0.0459    <0.001    6.4%
2            0.0487    0.032       3.7%          0.0533    0.001     6.3%
3            0.0712    0.031       3.8%          0.0408    0.042     6.2%
4            0.0435    0.131       4.4%          0.0379    0.024     5.9%
5           -0.0146    >0.500      4.2%          0.0204    0.224     5.9%
6            0.0143    >0.500      3.8%          0.0233    0.110     6.3%
7           -0.0080    >0.500      3.7%          0.0210    0.075     6.3%
8           -0.0145    >0.500      3.3%          0.0105    >0.500    6.7%
9           -0.0740    0.101       2.1%         -0.0193    0.146     7.9%
10          -0.0636    0.155       1.9%         -0.0634    0.002     7.9%
D1 - D10     0.1351    0.007                     0.1093    <0.001

Panel B: Traditional Operating Accruals

            Loss (34%)                          Gain (66%)
                                   % of                              % of
Decile      Return     p-value     sample       Return     p-value   sample
1           -0.0021    >0.500      8.6%          0.0619    0.128     1.3%
2            0.0539    0.195       5.8%          0.0743    <0.001    4.1%
3            0.0253    >0.500      4.1%          0.0482    0.025     6.0%
4            0.0350    0.361       3.1%          0.0374    0.002     7.0%
5            0.0075    >0.500      2.5%          0.0290    0.137     7.3%
6            0.0506    0.258       2.2%          0.0237    0.132     8.0%
7            0.0357    0.458       2.0%          0.0116    0.411     8.0%
8           -0.0414    0.259       2.0%         -0.0061    >0.500    8.0%
9           -0.0359    0.496       2.0%         -0.0069    >0.500    8.0%
10          -0.0468    0.384       1.9%         -0.0578    0.004     7.8%
D1 - D10     0.0447    >0.500                    0.1197    0.009


Returns are the time-series mean annual buy-and-hold size-adjusted returns, calculated as described in Barber et al. (1999) 
beginning in the fourth month after the fiscal year-end. The p-values are based on two-tailed Fama-MacBeth t-statistics 
computed over the 19 annual mean returns in a decile and the standard deviation of these 19 observations. The breakpoints 
between deciles are based on the previous year's cutoffs, so the number of observations in each decile varies slightly. 
Traditional operating accruals are defined as Net Income (Compustat item 172) less Cash from Operations (item 308) 
divided by average total assets (item 6). Percent operating accruals have the same numerator but scale by the absolute value 
of net income. Loss or Gain is based on the sign of item 172. 


Partitioning the percent accruals strategy on gain versus loss offers an interesting connection
to Burgstahler and Dichev (1997). In that study, firms with small gains are considered more likely
to be manipulating earnings up than firms with small losses. As previously discussed, the extreme
deciles of percent accruals are firms with relatively small positive or small negative earnings
(making the denominator close to zero). Consistent with the earnings manipulation hypothesis,
Table 11, Panel A shows that firms that are in the first decile of percent accruals, and accrue down
into a loss position, have higher returns in the subsequent year, as compared to those that accrued
down, but into a small gain position. The argument is that their accruals are less suspicious
because they chose to accrue into a small loss. At the other extreme, the earnings manipulation
hypothesis suggests that firms in the top decile of percent accruals that accrue up to a small gain
are more "manipulative" than those that accrue up to a small loss. The results for this case are not
as clear; the returns in the top decile of percent accruals are very similar regardless of the sign of
the firm's earnings. However, the significance level of the negative subsequent returns is superior
for firms that accrued up to a small positive gain, as compared to those that accrued up to a small
loss. Without overstating the significance of these results, there is some evidence that extreme
percent accruals combined with the sign of earnings might flag firms that have manipulated
earnings in a way that results in predictable market returns in the subsequent year.


Limits to Arbitrage 

Mashruwala et al. (2006) find that the accrual anomaly, based on traditional operating accruals,
is concentrated in firms with high arbitrage risk, as measured by the idiosyncratic volatility of
their stock returns. They also find that it is concentrated in firms with lower share prices, which is
a common proxy for transaction costs. They suggest that the combination of risk imposed on
arbitrageurs and high transaction costs might leave the accrual anomaly difficult to exploit. To
compare the two measures of accruals' exposure to arbitrage risk, we examine the returns to each
strategy for each quintile of arbitrage risk. Following Mashruwala et al. (2006), we estimate
arbitrage risk as the standard deviation of residuals from a regression of the firm-level returns on
the CRSP equally weighted index for up to 48 months preceding the month of return portfolio
formation. To keep the portfolio sizes roughly comparable to previous results, we sort percent
accruals and traditional accruals into quintiles. The resulting two-way sort is given in Table
12, and shown graphically in Figure 3.
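
The arbitrage risk estimate described above (residual volatility from a regression of firm returns on an index) might be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the inputs are aligned monthly return series, and the residual variance uses an n - 2 degrees-of-freedom denominator, which the text does not specify.

```python
import math
from typing import Sequence


def idiosyncratic_volatility(
    firm_returns: Sequence[float],
    index_returns: Sequence[float],
    max_months: int = 48,
) -> float:
    """Standard deviation of residuals from an OLS regression of firm returns on index returns.

    Uses at most the most recent `max_months` aligned observations.
    """
    if len(firm_returns) != len(index_returns):
        raise ValueError("return series must be aligned and of equal length")
    y = list(firm_returns)[-max_months:]
    x = list(index_returns)[-max_months:]
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("index returns have no variation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [b - (alpha + beta * a) for a, b in zip(x, y)]
    # Assumption: n - 2 degrees of freedom (intercept and slope estimated).
    return math.sqrt(sum(e * e for e in residuals) / (n - 2))
```

Firms would then be sorted into quintiles on this estimate before the accrual sort is applied within each quintile.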

As seen in Table 12, Panel A, for every quintile of arbitrage risk, the hedge return to percent
operating accruals dominates the return to traditional operating accruals. More importantly, the
return to percent accruals exceeds 10 percent in the top three quintiles, or over half the sample,
and is significant in all but the lowest quintile. Figure 3, Panel B shows the size-adjusted returns
to the long position of each strategy across quintiles of arbitrage risk. The returns to the long
position for both strategies are essentially zero for the quintile with the lowest arbitrage risk, but for
the top three quintiles the percent accruals strategy dominates the traditional accrual strategy, and
yields size-adjusted returns of 7 percent, 11 percent, and 8 percent. Table 12, Panel A shows that
the long-only returns are significant in the top three quintiles of arbitrage risk, and the hedge
returns are significant in the top four quintiles of arbitrage risk. In sum, the results in Figure 3 and
Table 12 show that, while percent accruals are not completely insensitive to the level of arbitrage
risk, they continue to be large and significant for a much larger portion of the distribution of
arbitrage risk than traditional accruals.

With respect to transaction costs, Panels A and B of Table 2 show that the firms in the lowest
decile of percent operating accruals are considerably larger than the firms in the lowest decile of
traditional operating accruals. They also have considerably higher share prices. The mean share price
in decile 1 for traditional operating accruals is $7.14 per share, while it is $13.92 per share for
percent operating accruals, almost twice as large. Larger firms and higher share prices have been
shown to proxy for lower transaction costs (e.g., Bartov et al. 2000). It is therefore unlikely that
greater transaction costs are allowing the higher excess returns to the first decile of percent
operating accruals to remain unexploited.


Mashruwala et al. (2006) require that the firm have 48 prior months of returns to be included in the sample. We do not
impose this data requirement because it is not generally a requirement imposed on studies of the accrual anomaly.
Further, as they report, the returns to an accrual strategy are significantly greater when this data requirement is not
imposed. Finally, while estimates based on fewer than 48 observations will be noisier, they will still be unbiased.
Requiring 48 months of prior returns, on the other hand, could inadvertently introduce a bias.

FIGURE 3
Annual Size-Adjusted Returns to Percent Operating Accruals by Quintile of Arbitrage Risk

Panel A: Return to Hedge between Top and Bottom Percent Operating Accrual Quintiles

Some other candidate explanations for the success of percent operating accruals are as follows:


Greater Association with Growth? The association between accrual measures and "growth"
depends on how growth is measured; if it is measured as the increase in operating assets, for
instance, then operating accruals and growth are almost synonymous. To assess whether
percent accruals are proxying for an underlying "growth" anomaly, without imposing a mechanical
relation between the two measures, we compute the average annual sales growth rate
over the past three years, ending in the year in which accruals are measured. We use this
definition of growth because it is not directly based on changes in operating assets in the same
period, and because it has been used in prior studies as a measure of "growth" or "glamour"
(see Desai et al. 2004, for example). Table 2, Panel A shows how growth varies across percent
operating accrual deciles. The table shows only a mild increase in growth across deciles. In
the extreme deciles, growth is lower in the first and tenth deciles of percent operating accruals
than it is in the first and tenth deciles of traditional operating accruals. As an alternative
growth measure, we compute the current year's sales growth. For our sample, the Spearman
correlation between the current year's sales growth and either traditional or percent operating
accruals is 0.19 and, like the past three-year growth rate, is lower in the extreme deciles of
percent accruals than in the extreme deciles of traditional accruals. In sum, while percent
operating accruals are weakly correlated with sales growth, they are no more correlated than
traditional operating accruals.

Faster Mean Reversion in Accruals? Earlier we reported that the differences between the 
naive and sophisticated forecasting models are larger in the extreme deciles of percent oper- 
ating accruals than in the extreme deciles of traditional operating accruals. This pattern does 
not lead to a more rapid mean reversion in accruals, however. Forty-one percent of the firms 
in the first two deciles of traditional operating accruals remain in the first two deciles in the 
following year. For percent operating accruals, 43 percent remain in the first two deciles in the 
following year. Thus, while the percent operating accruals measure picks out more extreme 
differences in forecasts between the naive and sophisticated models, the accruals for these 
observations do not reverse any faster than for the traditional operating accruals measure. 

Greater Capital Intensity? Another possibility is that, by not scaling by total assets, the
percent operating accrual measure is picking up more capital-intensive firms. This is not the 
case, however. The lowest decile of traditional operating accruals has a ratio of depreciation/ 
amortization to average total assets of 0.1261, while the same measure is only 0.0684 for the 
lowest decile of percent operating accruals. 

More Extreme "Outlier" Returns? Kraft et al. (2006) show that a large part of the positive 
excess return in the low-accrual portfolio is due to a few firms with extreme positive returns. 
The 99th percentile of excess returns in the first decile of traditional operating accruals is 
5.23, but is only 2.87 in the first decile of percent operating accruals. Further, the mean of the 
top five excess returns is almost twice as high in the first decile of traditional operating
accruals as it is in the first decile of percent operating accruals. While the distribution of
returns is clearly skewed in both samples, the percent operating accrual measure is less 
sensitive to extreme returns than is the traditional measure. Incidentally, of the 83 most 
extreme excess returns in the top percentile of the first decile of traditional operating accruals 
and percent operating accruals, only 4 observations are common between the two portfolios. 
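
The persistence comparison in the mean-reversion item above (the share of firms in the first two deciles that remain there in the following year) can be sketched as below. The function name and the firm identifiers are hypothetical, and the treatment of firms with no decile assignment in the following year (dropped here) is an assumption not stated in the text.

```python
from typing import Dict


def share_remaining_in_low_deciles(
    decile_this_year: Dict[str, int],
    decile_next_year: Dict[str, int],
    cutoff: int = 2,
) -> float:
    """Fraction of firms in deciles 1..cutoff this year that are still there next year.

    Assumption: firms missing from next year's assignment are excluded.
    """
    starters = [
        firm
        for firm, decile in decile_this_year.items()
        if decile <= cutoff and firm in decile_next_year
    ]
    if not starters:
        raise ValueError("no firms start in the low deciles")
    stayers = sum(1 for firm in starters if decile_next_year[firm] <= cutoff)
    return stayers / len(starters)


# Hypothetical example: A and C stay in the first two deciles, B and E leave.
this_year = {"A": 1, "B": 2, "C": 1, "D": 5, "E": 2}
next_year = {"A": 1, "B": 4, "C": 2, "D": 1, "E": 9}
share = share_remaining_in_low_deciles(this_year, next_year)
```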


VI. CONCLUSION 

Accruals are the main work product of financial accounting. It is therefore vitally important 
that we understand how accruals relate to equity valuation. This exploration goes beyond attempts 
to refine an investment strategy and produce ever-bigger excess returns. By documenting another 
means of using accrual information to identify misvalued stocks, we learn more about what the 
market does and does not understand. 

We offer a subtle innovation. By redefining accruals to be relative to income rather than to 
total assets—that is, percent accruals—we offer a new measure of accruals that selects radically 
different firms and produces excess returns in subsamples of the population for which the tradi- 
tional accruals measure performs poorly. Our results are consistent with the earnings fixation 
hypothesis insofar as extreme values of percent accruals correspond with cases in which the
difference between a naive and sophisticated forecasting model is the most pronounced. As the
literature continues to investigate the underlying explanation for the accruals anomaly, our results
provide additional regularities that the theory must accommodate.

The Accounting Review, January 2011
American Accounting Association

ABSTRACT: This study addresses whether firms' share prices correctly reflect two
accounting measures: dirty surplus and really dirty surplus. Dirty surplus is readily 
observable from the financial statements, but really dirty surplus, which arises from 
recognizing equity transactions such as employee stock option exercises at other than 
fair market value, is not. Findings show that dirty surplus and really dirty surplus are 
irrelevant for forecasting abnormal comprehensive income. However, findings also in- 
dicate that investors appear to undervalue really dirty surplus. Hedge returns are insig- 
nificant when portfolios are formed based on dirty surplus, but are significantly positive 
based on really dirty surplus. Really dirty surplus positive hedge returns are robust to a 
variety of sensitivity tests. Taken together, the findings are consistent with either inves- 
tors over-valuing firms that have large negative really dirty surplus or really dirty surplus 
being correlated with an unmodeled risk factor. 

I. INTRODUCTION

A substantial and growing literature considers whether investors properly assess the
characteristics of earnings and its components when setting stock prices. The question we
address is whether firms' share prices correctly reflect two accounting measures that have
received relatively little attention to date. The first of these is commonly referred to as “dirty 
surplus," which is a component of comprehensive income that is excluded from reported earnings, 
and therefore violates clean surplus accounting. We label the second accounting measure we 
consider "really dirty surplus," which arises when a firm issues or reacquires its own shares in a 
transaction that does not record the shares at fair market value. Examples of this kind of transac- 
tion are shares issued in a stock option exercise and a conversion of a bond into common stock. 
Prior to the implementation of FASB Statement No. 141, the pooling-of-interests method of 
accounting for business combinations could also result in substantial really dirty surplus. If investors
fully understand the predictive value of these accounting amounts, then it should not be
possible to develop a profitable trading strategy based on the magnitudes of these items. 

Unlike dirty surplus, which is readily observable from the financial statements, really dirty 
surplus is unobservable. That is, even the most sophisticated investor cannot estimate readily the 
valuation impact of equity transactions that give rise to really dirty surplus because equity trans- 
actions are recognized in the financial statements using an accounting-based rather than a market- 
based measure of the value of equity. The estimation task investors face is exacerbated by the fact 
that a firm can engage in numerous such transactions throughout the year. As a result, really dirty 
surplus transactions are less likely to be correctly priced than are dirty surplus transactions. 

Using financial statement and stock price data from 1976-2006, we first assess whether 
investors properly value each of these two components of earnings by estimating a residual 
income forecasting equation and an attendant valuation equation that includes both of these com- 
ponents. If current residual income is sufficient for forecasting the next period's residual income 
and current residual income and equity book value are sufficient for valuing current equity, then 
the forecasting and valuation coefficients on the income components of interest will be linked in 
a predictable manner. Finding a mismatch between the components' forecasting and valuation 
equation coefficients would be consistent with investors' mispricing of the components.

We also conduct hedge portfolio returns tests. We adopt a buy-and-hold strategy to go long in 
firms with relatively large dirty or really dirty surplus and to go short in firms with relatively small 
dirty or really dirty surplus. We conjecture that small firms' prices are less likely to reflect fully all 
publicly available information because investors incur proportionately greater transaction costs; as 
a result, they are less closely followed and less likely to be subject to detailed accounting analysis. 
We therefore conduct both sets of tests separately for small, medium, and large firms to assess 
whether pricing effects are related to firm size. 
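
A hedge return of the kind described above can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the function name, the equal weighting within each leg, and the `fraction` parameter that decides how many firms count as "relatively large" or "relatively small" are hypothetical, since the cutoffs are set out in the research design rather than in this passage.

```python
from typing import List, Sequence, Tuple


def _mean_return(leg: List[Tuple[float, float]]) -> float:
    """Equal-weighted mean of the return component of (signal, return) pairs."""
    return sum(ret for _, ret in leg) / len(leg)


def hedge_return(signal_and_return: Sequence[Tuple[float, float]], fraction: float = 0.2) -> float:
    """Long the firms with the largest signal, short those with the smallest.

    Each element is (signal, subsequent buy-and-hold return), where the signal
    would be dirty surplus or really dirty surplus. `fraction` is hypothetical.
    """
    if not signal_and_return:
        raise ValueError("no observations")
    ranked = sorted(signal_and_return, key=lambda pair: pair[0])
    k = max(1, int(len(ranked) * fraction))
    short_leg = ranked[:k]
    long_leg = ranked[-k:]
    return _mean_return(long_leg) - _mean_return(short_leg)


# Hypothetical data: (signal, subsequent return).
sample = [(-5.0, 0.01), (-3.0, 0.02), (-1.0, 0.03), (0.0, 0.05), (2.0, 0.10)]
spread = hedge_return(sample)
```

Running the same calculation separately within small, medium, and large firms would give the size-partitioned hedge returns discussed in the text.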

We find that both dirty surplus and really dirty surplus are irrelevant for forecasting abnormal 
very comprehensive income for all three firm-size groups. Taking these results at face value, if 
investors correctly understand the implications of these persistence findings for valuation, then 
each kind of dirty surplus should be irrelevant for valuation for all firms. This prediction is borne
out in the case of dirty surplus. However, the findings indicate that investors appear to undervalue 
really dirty surplus, which is consistent with investors being unable to assess the economic im- 
plications of really dirty surplus transactions. 

Buy-and-hold hedge return results support the findings from the tests linking the forecasting 
and valuation equations. As expected, hedge returns are insignificantly different from zero when 
based upon dirty surplus, regardless of firm size and investing horizon. In contrast, the hedge 
returns based on really dirty surplus are significantly positive for all three firm-size groups. We 
also consider an alternative to our buy-and-hold procedure for computing hedge returns. Findings 
based on mean returns for monthly calendar-time hedge portfolios indicate that significantly positive
hedge returns are concentrated within small firms. Findings from additional tests reveal that
inferences relating to hedge returns are insensitive to including controls for four previously identified
mispricing anomalies, and to sampling procedures designed to attempt to focus on sources of
really dirty surplus.

Taken together, the findings are consistent with investors over-valuing firms that have large 
negative really dirty surplus. However, several cautionary notes are in order. First, although the 
hedge returns findings are consistent with mispricing of really dirty surplus, the possibility remains 
that the mismatch of the really dirty surplus forecasting and valuation coefficients is the result of 
model misspecification rather than mispricing. Second, as is likely the case with investors, we are 
unable to trace the sources of really dirty surplus to particular types of equity transactions. As a 
result, we cannot determine the extent to which potential mispricing arises from each type of 
transaction, i.e., our findings can only be interpreted as reflecting the aggregate effect of the 
various types of transactions. However, even if we could trace the sources of really dirty surplus, 
any resulting hedge returns might still be attributable to an unmodeled risk factor. 

Our study adds to prior research finding evidence of investors’ apparent failure to link the 
forecasting attributes of accounting amounts with the pricing implications (e.g., Bernard and 
Thomas 1989; Sloan 1996; Barth et al. 1999; Bradshaw and Sloan 2002; Burgstahler et al. 2002; 
Brown and Sivakumar 2003; Doyle et al. 2003; Landsman et al. 2007). Our findings support prior 
studies that find that investors understand the forecasting properties and valuation implications of 
dirty surplus (Dhaliwal et al. 1999; O'Hanlon and Pope 1999; Biddle and Choi 2006; Chambers et 
al. 2007). Although Landsman et al. (2006) examine valuation implications of expected future 
equity transactions arising from the exercise of employee stock options, their study does not 
address whether investors take full account of the valuation implications of past option exercises. 
Core et al. (2002) report findings suggesting that dilutive transactions, including those arising from 
employee stock option grants, are poorly dealt with in reported diluted earnings per share, leaving 
open the possibility that investors may have difficulty in valuing such transactions. 

Section II provides the motivation for the study and explains how dirty surplus and really
dirty surplus are defined. Section III presents the research design, including computation of dirty 
surplus and really dirty surplus, development of the forecasting and valuation equations, and 
description of our hedge return strategy. Section IV describes the sample and data, and Section V 
presents the findings. Section VI summarizes and concludes. 


II. MOTIVATION 

The empirical issue that is central to this research is whether firms' share prices correctly 
reflect two accounting measures that have received relatively little attention to date. The first of 
these is commonly referred to as dirty surplus (DS), which is a component of comprehensive 
income that is excluded from reported earnings and therefore violates clean surplus accounting 
(Ohlson 1995; Feltham and Ohlson 1995). Dirty surplus accounting results in the basic residual 
income valuation model yielding an inaccurate estimate of equity value because the sum of current 
book value and future net incomes does not equal the sum of future net dividends. 

The second accounting measure, which we label really dirty surplus (RDS), arises
when a firm issues or reacquires its own shares in a transaction that does not record the shares at
fair market value. The primary sources of RDS include employee stock option exercises, conver- 
sion of preferred stock and bonds, and mergers accounted for as pooling-of-interests. Whereas 
equity issued under employee stock option exercises or convertible instruments can give rise to 
unrecorded expenses, equity issued under pooling-of-interests gives rise to an unrecorded asset. 


1 Equity issued under the pooling-of-interests method is not recognized at fair market value. In contrast, if purchase

RDS violates the super-clean surplus concept (Feltham 1996; Christensen and Feltham 2003),
under which it is assumed that share issuances are recorded at fair market value. When this
condition is violated, the discounted present value of future net dividends (or equivalently, the sum
of equity book value and discounted present value of future abnormal earnings) will not equal the
market value of equity relating to current shares outstanding, but rather will equal the market value
of equity relating to current shares outstanding plus the market value of other equity claimants.
Because the equity transactions that give rise to RDS generally are recorded in the financial
statements at less than market value, RDS is generally negative.²

DS is readily observable from the financial statements. When one takes a clean surplus 
accounting perspective, as comprehensive income does, dirty surplus becomes a component of 
earnings. Dirty surplus is conventionally defined as the sum of recognized revenue or expense 
items that bypass the income statement. Unlike DS, RDS is not reported in the financial statements. 
A super-clean surplus accounting perspective requires that both dirty surplus and really dirty
surplus become components of earnings so that the discounted stream of future residual incomes
and current equity book value sum to the market value of equity of current shareholders. If investors
fully understand the implications of DS and RDS for valuation, then it should not be possible to
develop a profitable trading strategy based on the magnitudes of these items. 

To see these points more clearly, consider first the following version of clean surplus accounting:


$BVE_t = BVE_{t-1} + X_t + DS_t - Div_t + P^A_t (N_t - N_{t-1})$,    (1)


where $BVE_t$ is defined as ending equity book value, $X_t$ represents net income, $DS_t$ is dirty surplus,
$Div_t$ is dividends, $N_t$ is the number of shares outstanding at the end of period t, and $P^A_t$ is the price
per share used to record the issuance or reacquisition of equity shares in the accounting system.
Note that if $DS_t$ is zero, then the accounting is said to satisfy clean surplus accounting.

Let $P^M_t$ be the market price per share at the date of issuance or reacquisition of equity shares.
We define really dirty surplus, $RDS_t$, by:


$RDS_t = (N_t - N_{t-1})(P^A_t - P^M_t)$.    (2)

By combining Equation (1) and Equation (2), we arrive at:

$DS_t + RDS_t = BVE_t - BVE_{t-1} - X_t + Div_t - P^M_t (N_t - N_{t-1})$.    (3)


$DS_t$, $BVE_t$, $BVE_{t-1}$, $X_t$, and $Div_t$ are readily observable in the financial statements. The final term
on the right-hand side of Equation (3), $P^M_t (N_t - N_{t-1})$, is not reported in the financial statements
and therefore needs to be estimated.
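
As a numerical check on Equations (1) through (3), the following minimal sketch works through a single share issuance. All figures and function names are hypothetical and chosen only for illustration; none are taken from the study.

```python
def really_dirty_surplus(n_t, n_prev, p_acct, p_mkt):
    """Equation (2): RDS_t = (N_t - N_{t-1}) * (P^A_t - P^M_t)."""
    return (n_t - n_prev) * (p_acct - p_mkt)


def ds_plus_rds(bve_t, bve_prev, x_t, div_t, n_t, n_prev, p_mkt):
    """Equation (3): DS_t + RDS_t implied by reported amounts and the market price."""
    return bve_t - bve_prev - x_t + div_t - p_mkt * (n_t - n_prev)


# Hypothetical firm: 10 new shares recorded at $10 each but worth $15 each.
n_prev, n_t = 100, 110
p_acct, p_mkt = 10.0, 15.0
bve_prev, x_t, ds_t, div_t = 1000.0, 80.0, 5.0, 20.0

# Equation (1): ending book value as recorded by the accounting system.
bve_t = bve_prev + x_t + ds_t - div_t + p_acct * (n_t - n_prev)

rds_t = really_dirty_surplus(n_t, n_prev, p_acct, p_mkt)
total = ds_plus_rds(bve_t, bve_prev, x_t, div_t, n_t, n_prev, p_mkt)

print(bve_t, rds_t, total)
```

In this illustration RDS is negative because the shares are recorded below market value, and the amount recovered from Equation (3) matches the sum of the separately computed DS and RDS.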

Note that if both $DS_t$ and $RDS_t$ are zero in Equation (3), then the accounting is said to satisfy
super-clean surplus accounting. The next section allows for both nonzero $DS_t$ and $RDS_t$ and
super-clean surplus accounting by setting $VCNI_t = X_t + DS_t + RDS_t$, where VCNI is “very
comprehensive” net income. Our definition of RDS, and hence VCNI, attributes all of the violation of
super-clean surplus accounting to the period during which the equity transaction is recorded at a
price other than fair market value. Christensen and Feltham (2003) show that when super-clean
surplus accounting holds in periods subsequent to time t, application of the residual income


accounting was applied instead of the pooling-of-interests method, RDS would not arise; instead, the amount of RDS
attributed to pooling-of-interests would be recognized in the financial statements as goodwill.

2 It is possible that RDS could be positive. For example, consider a bond with a book value of $175 and fair value of $125
that is converted into equity whose fair value is $150. In this case, RDS would be a positive amount equal to $25.
Likewise, a merger accounted for under pooling-of-interests could give rise to
negative unrecorded goodwill and, hence, positive RDS.

valuation model will yield an estimate of equity value that equals the market value of equity of
existing shares. Whether super-clean surplus accounting holds up to and including period t simply
affects the opening balance of equity book value at time t. However, if $P^M$ and $P^A$ differ for
transactions in periods subsequent to time t, then super-clean surplus accounting will be violated
and, hence, the residual income valuation model will not yield an estimate of equity value that
equals the market value of existing shares.

As stated above, the empirical issue that is central to this research is whether firms’ price per 
share correctly reflects DS and RDS. If it does, then one should not be able to develop a trading 
strategy based on DS or RDS that generates future abnormal returns. There are several reasons 
why we expect that RDS is the better earnings component on which to base a trading strategy. 
First, as noted above, unlike DS, RDS is not reported in the financial statements. Second, RDS 
appears to be inherently complex. For example, for most earnings components, any “overstate- 
ment” or “understatement” reverses in future periods; this does not hold for RDS. Third, research 
on DS (Dhaliwal et al. 1999; O’Hanlon and Pope 1999; Biddle and Choi 2006; Chambers et al. 
2007) is not especially encouraging about the possibility that it can be used to construct a profit- 
able trading strategy. 

Nonetheless, there are at least two compelling reasons for conducting our tests for DS as well 
as RDS. First, our study is the first to examine whether investors properly price DS based on the 
forecasting and valuation equations in the Ohlson (1999) model as well as on hedge return tests. 
Second, because we do not necessarily expect to find evidence of mispricing relating to DS, 
finding this is the case mitigates concerns that finding evidence of mispricing relating to RDS is 
attributable to misspecification of our empirical procedures. 


III. RESEARCH DESIGN
Computation of DS and RDS 


Following Dhaliwal et al. (1999) and Chambers et al. (2007), we compute DS as the sum of
(1) the change in the balance of unrealized gains or losses on marketable securities (change in
Compustat #238), (2) the change in the cumulative foreign exchange adjustment (change in
Compustat #230), and (3) 0.65 times the change in additional pension liability in excess of
unrecognized prior service costs (change in Min[(Compustat #297 - #298), 0]).

Based on Equation (3), we compute RDS as the change in the book value of common equity
(Compustat #60 + #227 - #242), less DS, less net income (Compustat #172 - #19), plus
dividends (Compustat #21), less share price at the middle of the fiscal year times the change in common
shares outstanding (Compustat #25, adjusted for stock dividends and splits). Note that because we
(and investors) cannot readily compute RDS using the individual underlying equity transactions, RDS
likely measures the true underlying construct with error.⁴ Share prices⁵ are from the CRSP database.
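
The computation described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys (`d238`, `d60`, and so on) are hypothetical names for the Compustat items cited in the text, the share count is assumed to be already adjusted for stock dividends and splits, and the input values are invented.

```python
PENSION_WEIGHT = 0.65  # multiplier applied to the pension component in the text


def pension_component(rec):
    """Min[(#297 - #298), 0] for one firm-year."""
    return min(rec["d297"] - rec["d298"], 0.0)


def dirty_surplus(cur, prev):
    """DS: changes in items #238 and #230 plus 0.65 times the change in the pension component."""
    d_securities = cur["d238"] - prev["d238"]
    d_fx = cur["d230"] - prev["d230"]
    d_pension = pension_component(cur) - pension_component(prev)
    return d_securities + d_fx + PENSION_WEIGHT * d_pension


def book_equity(rec):
    """Book value of common equity: #60 + #227 - #242."""
    return rec["d60"] + rec["d227"] - rec["d242"]


def really_dirty_surplus(cur, prev, ds, mid_year_price):
    """RDS per Equation (3): change in BVE - DS - NI + dividends - price * change in shares."""
    d_bve = book_equity(cur) - book_equity(prev)
    net_income = cur["d172"] - cur["d19"]
    d_shares = cur["d25"] - prev["d25"]
    return d_bve - ds - net_income + cur["d21"] - mid_year_price * d_shares


# Invented firm-year data (not from the study's sample).
prev = {"d238": 10.0, "d230": -4.0, "d297": 0.0, "d298": 0.0,
        "d60": 500.0, "d227": 0.0, "d242": 0.0, "d25": 100.0}
cur = {"d238": 13.0, "d230": -6.0, "d297": 20.0, "d298": 30.0,
       "d60": 560.0, "d227": 0.0, "d242": 0.0, "d25": 104.0,
       "d172": 70.0, "d19": 0.0, "d21": 15.0}

ds = dirty_surplus(cur, prev)
rds = really_dirty_surplus(cur, prev, ds, mid_year_price=12.0)
print(ds, rds)
```

Because RDS is computed net of DS, any dirty surplus item omitted from `dirty_surplus` would flow into the RDS figure, which is the measurement-error concern the text raises.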


3 Landsman et al. (2006) show that, in the case of employee stock options (i.e., contingent equity), when only clean
surplus holds, the estimate of equity value equals the sum of the market value of existing shares and employee stock
options. The study's model considers the case in which employee stock options are granted at time t or earlier. The
residual income valuation model does not yield an estimate of the value of existing shares because the options are not
yet exercised, and when they are exercised in the future, the new shares will be recognized at $P^A$ rather than $P^M$.

4 To the extent that our definition of DS does not include all dirty surplus items (e.g., the cumulative effects on equity of
retrospective accounting changes), DS will be measured with error. Because RDS is net of DS, such items will appear
as part of our measure of RDS. Also, our measure of RDS includes treasury stock transactions taking place at prices that
differ from the market price at the middle of the fiscal year.

5 In particular, the use of mid-year prices in the construction of RDS is arbitrary. We test the sensitivity of our findings to
measuring RDS at alternative dates using both end-of-year and average of beginning- and end-of-year prices.
Untabulated findings based on these alternative measures reveal that none of the inferences are affected.

Forecasting and Valuation Equations


To examine how the dirty surplus and really dirty surplus components of income relate to
equity value, we adopt the abnormal earnings forecasting and equity valuation equations from
Barth et al. (1999), which are based on the linear information system developed in Ohlson (1999):


$VCNI^a_{it+1} = \omega_0 + \omega_1 VCNI^a_{it} + \omega_2 DS_{it} + \omega_3 RDS_{it} + \omega_4 BVE_{it} + \varepsilon_{it+1}$,    (4)


$MVE_{it} = \alpha_0 + \alpha_1 VCNI^a_{it} + \alpha_2 DS_{it} + \alpha_3 RDS_{it} + \alpha_4 BVE_{it} + u_{it}$.    (5)


Equation (4) is the abnormal earnings forecasting equation, where abnormal very comprehensive
earnings, $VCNI^a_t$, is defined as very comprehensive earnings, $VCNI_t$, less a normal return on
beginning equity book value, $BVE_{t-1}$, i.e., $VCNI_t - rBVE_{t-1}$. Very comprehensive income is net
income, $NI_t$, plus both dirty surplus and really dirty surplus. Following Ohlson (1999) and Barth
et al. (1999), $VCNI_t$ is partitioned into $NI_t$, $DS_t$, and $RDS_t$. The linear information system
represented by Equation (4) and Equation (5) implicitly assumes that current earnings amounts are
predictive of all future earnings. To the extent that this assumption is violated, the algebraic links
between forecasting coefficients in Equation (4) and the valuation coefficients in Equation (5)
described below do not necessarily hold. Of particular significance to this study is whether current
realizations of DS and RDS are predictive of future $VCNI^a$ that includes future realizations of these
variables.

In Equation (4), $\omega_1$ reflects the persistence of abnormal earnings. Prior research (e.g., Dechow
et al. 1999; Barth et al. 1999, 2005) leads us to predict that $\omega_1$ is positive.⁶ The coefficients on the
DS and RDS earnings components, $\omega_2$ and $\omega_3$, reflect the incremental effects on the forecast of
abnormal earnings of knowing these components. If all earnings components have the same ability
to forecast $VCNI^a_{t+1}$, then $\omega_2$ and $\omega_3$ will both equal zero; thus, knowing each component of
earnings does not aid in forecasting abnormal earnings. As a result, we test the null hypotheses that
$\omega_2 = 0$ and $\omega_3 = 0$ against the alternatives that $\omega_2 \neq 0$ and $\omega_3 \neq 0$.

Following Ohlson (1999, 150), we define DS (RDS) as being “forecasting-irrelevant” if the
quadruple $\{NI_t, RDS_t, BVE_t, BVE_{t-1}\}$ ($\{NI_t, DS_t, BVE_t, BVE_{t-1}\}$) contains the same information
as the quintuple $\{NI_t, DS_t, RDS_t, BVE_t, BVE_{t-1}\}$ for purposes of forecasting $VCNI^a_{t+1}$. Because DS
and RDS are components of $VCNI^a$, the total coefficients on $DS_t$ and $RDS_t$ are $\omega_1 + \omega_2$ and
$\omega_1 + \omega_3$. $\omega_4$ is not included in the total coefficient on either $DS_t$ or $RDS_t$ because $BVE_t$ is
invariant to the definition of clean surplus. Thus, if $\omega_1 + \omega_2 = 0$ ($\omega_1 + \omega_3 = 0$), DS (RDS) is
irrelevant for forecasting abnormal earnings. Conversely, if $\omega_1 + \omega_2 \neq 0$ ($\omega_1 + \omega_3 \neq 0$), then
DS (RDS) is said to have abnormal earnings “forecasting relevance.” To examine whether dirty surplus
and really dirty surplus components of comprehensive income are forecasting-irrelevant, we test the
null hypotheses that $\omega_1 + \omega_2 = 0$ and $\omega_1 + \omega_3 = 0$ against the alternatives that
$\omega_1 + \omega_2 \neq 0$ and $\omega_1 + \omega_3 \neq 0$. Note that $\omega_1$ reflects the
forecasting relevance of the $VCNI^a_t - DS_t - RDS_t = NI_t - rBVE_{t-1}$ component of $VCNI^a_t$.
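
The construction of abnormal very comprehensive income and the test of a total coefficient can be sketched as follows. This is a simplified illustration, not the study's estimation code: the coefficient estimates, variances, and covariance passed in are invented, and in practice they would come from estimating Equation (4) with whatever standard errors the researcher adopts.

```python
import math

COST_OF_EQUITY = 0.12  # r; the study sets the cost of equity capital to 12 percent


def abnormal_vcni(ni, ds, rds, bve_prev, r=COST_OF_EQUITY):
    """VCNI^a_t = (NI_t + DS_t + RDS_t) - r * BVE_{t-1}."""
    return ni + ds + rds - r * bve_prev


def total_coefficient_t(b1, bk, var1, vark, cov1k):
    """Total coefficient b1 + bk and its t-statistic under H0: b1 + bk = 0.

    The variance of a sum of two estimates is var1 + vark + 2 * cov1k.
    """
    total = b1 + bk
    std_error = math.sqrt(var1 + vark + 2.0 * cov1k)
    return total, total / std_error


# Invented inputs for illustration only.
vcni_a = abnormal_vcni(ni=100.0, ds=5.0, rds=-20.0, bve_prev=500.0)
total, t_stat = total_coefficient_t(b1=0.60, bk=-0.55,
                                    var1=0.0025, vark=0.0049, cov1k=-0.0012)
print(vcni_a, total, t_stat)
```

A total coefficient that is statistically indistinguishable from zero corresponds to what the text calls forecasting irrelevance of the component, even when the incremental coefficient alone differs from zero.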

Equation (5) is the valuation equation based on the information dynamics in Equation (4). $\alpha_2$
and $\alpha_3$, the valuation multiples on DS and RDS, can be interpreted in a symmetrical fashion. This
follows from the fact that although $DS_t$ is by definition included in $BVE_t$, it follows from Equation
(1) and Equation (3) that $RDS_t$ that arises from dilutive transactions is normally included in $BVE_t$




6 Ohlson (1995, 1999) permits the forecasting and valuation equations to include “other information.” Fairfield et al.
(2003) show that accruals and asset growth have incremental ability to predict future return on assets. Accordingly,
viewing accruals and asset growth as “other information,” below we report findings from alternative specifications of
Equation (4) and Equation (5) that include proxies for these variables as additional explanatory variables.


The Accounting Review | January 2011 
American Accounting Association 


Do Investors Understand Really Dirty Surplus? 243 


as well.⁷ Analogous to the interpretation of $\omega_2$ ($\omega_3$) in Equation (4), $\alpha_2$ ($\alpha_3$) reflects the
incremental effect on valuation from knowing DS (RDS). If all earnings components are equally
persistent, then they should have the same relation with equity value. If this is the case, then $\alpha_2$
and $\alpha_3$ will equal zero, and knowing each component of earnings will not aid in explaining equity
value. Thus, we test the null hypothesis that $\alpha_2 = 0$ ($\alpha_3 = 0$) against the alternative that
$\alpha_2 \neq 0$ ($\alpha_3 \neq 0$). We define DS (RDS) as being “valuation-irrelevant” if the quadruple
$\{NI_t, RDS_t, BVE_t, BVE_{t-1}\}$ ($\{NI_t, DS_t, BVE_t, BVE_{t-1}\}$) contains the same information as the
quintuple $\{NI_t, DS_t, RDS_t, BVE_t, BVE_{t-1}\}$ for purposes of valuation. Also analogous to Equation (4),
the total valuation coefficient on DS (RDS) equals $\alpha_1 + \alpha_2$ ($\alpha_1 + \alpha_3$). Thus, if
$\alpha_1 + \alpha_2 = 0$ ($\alpha_1 + \alpha_3 = 0$), DS (RDS) is irrelevant for valuation.⁸ Conversely, if
$\alpha_1 + \alpha_2 \neq 0$ ($\alpha_1 + \alpha_3 \neq 0$), then DS (RDS) is “valuation-relevant.” Analogous to the
interpretation of $\omega_1$ in Equation (4), $\alpha_1$ reflects the
value relevance of the $VCNI^a_t - DS_t - RDS_t = NI_t - rBVE_{t-1}$ component of $VCNI^a_t$.

Barth et al. (1999) derive a formula linking the coefficients in Equation (4) and the two 
suppressed equations with the coefficients in Equation (5). For our purposes, we are not interested 
in exact coefficient magnitudes based on imposing a full set of linear information dynamics. 
Instead, we are interested in the weaker prediction that the sign of $\alpha_1 + \alpha_2$ ($\alpha_1 + \alpha_3$) will be based
on the sign of $\omega_1 + \omega_2$ ($\omega_1 + \omega_3$).

If prices are determined rationally, then if DS or RDS is irrelevant for forecasting the next
period amount, each should be valuation-irrelevant as well if the linear dynamics in Equation (4)
and Equation (5) hold. Also, the sign of $\alpha_1 + \alpha_2$ ($\alpha_1 + \alpha_3$) will be the same as the sign of
$\omega_1 + \omega_2$ ($\omega_1 + \omega_3$). If we find apparent evidence of mispricing based on the empirical coefficients
from estimating Equation (4) and Equation (5), then a buy-and-hold strategy of going long in 
relatively underpriced stocks and short in relatively overpriced stocks should yield excess returns. 

Any mismatch between the forecasting and valuation results for DS or RDS need not neces- 
sarily be attributable to its mispricing by investors. It might be the case, for example, that although 
$RDS_t$ cannot be used to forecast $VCNI^a_{t+1}$, it could be used to forecast $VCNI^a_{t+k}$ (k = 2, 3, ...). If this
were the case, then RDS would not be valuation-irrelevant, and the mismatch between forecasting 
and valuation coefficients would be attributable to variables omitted from both equations. The 
hedge returns tests provide a means of examining this issue. 

If transaction cost considerations imply that small firms are more difficult to price, then we
would expect the hedge portfolio returns to be greater for small firms than for larger firms. Therefore,
we estimate and test predictions relating to Equation (4) and Equation (5) separately for small, 
medium, and large firms based on equity market value and conduct hedge return tests also sepa- 
rately for small, medium, and large firms. 


7 This can be illustrated by the following simple bond conversion example. Consider a firm that has a convertible bond
outstanding on its books at $100 that is converted into shares worth $150 at time t. Under current GAAP, the share
issuance will be recorded at $100. If we assume for simplicity that X_t = DS_t = 0 and that BVE_{t-1} = $1,000, it follows that
BVE_t = 1,000 + 100 = $1,100. If this transaction were to be accounted for on a super-clean surplus basis, the share
issuance would be recorded at $150, with the resultant cost of conversion appearing as RDS_t = -50. We can deduce from
Equation (3) that under super-clean surplus accounting BVE_t = 1,000 - 50 + 150 = $1,100 as well. Although the
calculations are more complex in the case of employee stock options, the same conclusion applies. Note that in the case of
mergers accounted for under pooling-of-interests, the inclusion of RDS_t in BVE_t would leave BVE_t unchanged only if the
asset (goodwill) associated with an acquisition was immediately expensed.

8 Note that under the Ohlson (1999) framework, value irrelevance of an earnings component (which occurs for DS (RDS)
when $\alpha_1 + \alpha_2 = 0$ ($\alpha_1 + \alpha_3 = 0$)) implies that it has no impact on goodwill, which is the difference between
equity market value and book value. Ohlson (1999, 152) further states that “an incremental dollar of transitory earnings
adds a dollar to market value. This claim is easy to validate as long as one keeps in mind that a dollar of transitory
earnings also adds a dollar to book value.”

Hedge Portfolio Strategy and Procedure


Hedge Strategy Overview 


We determine the hedge portfolio strategy in the following manner. First, for each sample 
year, we rank firms according to either DS or RDS as a fraction of end-of-year equity book value, 
BVE. We then form ten portfolios whereby the first (tenth) portfolio contains those observations 
with the smallest (largest) fraction of DS or RDS. Second, within each of the ten DS or RDS 
portfolios we rank firms according to equity market capitalization and assign each firm to one of 
three equal-sized groups of firms comprising the small, medium, and large firms. This procedure 
results in there being ten portfolios within each of the three firm-size groups. It also ensures that 
the magnitude of DS or RDS does not vary systematically across the three firm-size groups, and 
thereby helps to mitigate the confounding effect of firm size when conducting our hedge portfolio 
tests.¹⁰ Third, we then combine observations from all sample years, retaining the firm size
designation and DS and RDS portfolio rankings. This results in there being three firm-size groups,
within each of which there are ten DS or RDS portfolios.¹¹ Fourth, within each of the three
firm-size groups, for each firm in the ten DS or RDS portfolios, we compute the risk-adjusted
return over all sample years. Fifth, we compute the hedge return by deducting the equally
weighted mean risk-adjusted return on the portfolio(s) comprising firms we expect to be most
over-valued from the return on portfolio(s) comprising firms we expect to be either undervalued or
least over-valued.¹²

We predict that over-valuation is most likely to occur for firms whose income is overstated 
relative to very comprehensive net income, and where the market fails to understand the economic 
implications of such overstatement. As noted in Section II, we expect these conditions to be more 
descriptive for RDS than DS. Recall that RDS is generally non-positive because the accounting 
procedures that give rise to RDS arise from equity transactions that generally are recorded at less 
than market value. Our hedge strategy is therefore long in firms with least negative RDS and short 
in firms with most negative RDS. We employ a similar strategy for DS, i.e., go long in firms with 
most positive DS and short in firms with most negative DS. As noted above, we do not expect this 
DS-based hedge strategy to yield significant positive (or negative) excess returns. 
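
The sorting procedure and hedge return calculation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`signal`, `mve`, `ret`) and the sample data are hypothetical, `signal` stands for DS or RDS as a fraction of end-of-year book value, `ret` stands for a risk-adjusted return that has already been computed, and the default long and short legs use the top three and bottom three portfolios that the study adopts for its hedge tests.

```python
from collections import defaultdict
from statistics import mean


def assign_buckets(items, key, n_buckets):
    """Sort items by key and label them 1..n_buckets in near-equal groups."""
    ordered = sorted(items, key=key)
    n = len(ordered)
    return [(item, (rank * n_buckets) // n + 1) for rank, item in enumerate(ordered)]


def form_portfolios(firm_years):
    """Within each year: ten signal portfolios, then three size groups inside each."""
    by_year = defaultdict(list)
    for obs in firm_years:
        by_year[obs["year"]].append(obs)
    cells = defaultdict(list)
    for year, year_obs in by_year.items():
        for obs, decile in assign_buckets(year_obs, lambda o: o["signal"], 10):
            cells[(year, decile)].append(dict(obs, decile=decile))
    size_names = {1: "small", 2: "medium", 3: "large"}
    labelled = []
    for cell in cells.values():
        for obs, tercile in assign_buckets(cell, lambda o: o["mve"], 3):
            labelled.append(dict(obs, size=size_names[tercile]))
    return labelled


def hedge_return(labelled, size, long_deciles=(8, 9, 10), short_deciles=(1, 2, 3)):
    """Equally weighted mean return of the long leg minus that of the short leg."""
    group = [o for o in labelled if o["size"] == size]
    long_leg = [o["ret"] for o in group if o["decile"] in long_deciles]
    short_leg = [o["ret"] for o in group if o["decile"] in short_deciles]
    return mean(long_leg) - mean(short_leg)


# Hypothetical single-year sample of 30 firms (illustrative numbers only).
sample = [{"year": 2000, "signal": i - 25, "mve": (i * 7) % 31 + 1, "ret": i / 100}
          for i in range(30)]
labelled = form_portfolios(sample)
hedges = {s: hedge_return(labelled, s) for s in ("small", "medium", "large")}
print(hedges)
```

Sorting on the signal first and on size second keeps the signal's distribution similar across the three size groups, which is the stated purpose of the two-step sort.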

Following Bernard and Thomas (1990), we compute the hedge return for each of the three 
firm-size groups by going long (short) in the firms in the top three (bottom three) DS or RDS
portfolios. Combining observations in the top three and bottom three DS or RDS portfolios confers
the benefit of mitigating the potential effects of measurement error in the extreme DS or RDS
portfolios. We employ the hedge portfolio tests to complement the tests based on the forecasting
and valuation equations. In particular, if the forecasting and valuation equations yield evidence of 
mispricing, notably undervaluation of DS or RDS, then the hedge portfolio tests should yield 
evidence that excess returns can be earned by exploiting such undervaluation. Conversely, if the 
forecasting and valuation equations yield no evidence of mispricing, then the hedge portfolio tests 


9 Untabulated findings based on DS and RDS deflated by total assets result in no changes in inferences.

10 By design, this procedure is a double-conditional sort of first RDS (or DS) then size. As a consequence, this procedure
can fail to adequately control for size differences between long and short RDS portfolios. To assess the sensitivity of
hedge returns, we reversed the sorting procedure and recomputed hedge returns sorting firm-years using a
double-conditional sort of first size then RDS. Hedge return findings based on this alternative procedure result in inferences that
are substantially the same as those based on the tabulated hedge return findings.

11 Because firm size is increasing during our sample period, some large firms in early sample years would be considered
small firms in later sample years. However, because firm-size groupings are determined annually, our procedure
mitigates year effects on our hedge portfolio test inferences.

12 Untabulated findings based on hedge returns computed with value-weighted portfolio risk-adjusted returns result in no
change in inferences. Additional untabulated findings based on cumulative abnormal returns also result in no change in
inferences.

should yield evidence that excess returns cannot be earned following our hedge strategy.


Hedge Strategy Implementation Details 

To estimate risk-adjusted return, we need a measure of expected stock return. Following Ang 
and Liu (2004), Ibbotson and Associates (2005), Massa et al. (2005), and Barth et al. (2008), we 
use the Fama and French (1993) three-factor model, supplemented with the momentum factor 
(Jegadeesh and Titman 1993; Carhart 1997), with time-varying factor loadings, risk-free rates, and 
risk premia. We calculate each firm's expected equity return for month t+1 as of month t, $ER_{i,t+1}$,
conditional on the expected factor returns in month t+1, based on Equation (6):


$ER_{i,t+1} = R_{f,t+1} + \hat{\beta}_{RMRF,i,t+1}(R_{M,t+1} - R_{f,t+1}) + \hat{\beta}_{SMB,i,t+1} SMB_{t+1} + \hat{\beta}_{HML,i,t+1} HML_{t+1}
+ \hat{\beta}_{MOM,i,t+1} MOM_{t+1}$,    (6)


where $\hat{\beta}_{RMRF,i,t+1}$, $\hat{\beta}_{SMB,i,t+1}$, $\hat{\beta}_{HML,i,t+1}$, and $\hat{\beta}_{MOM,i,t+1}$ are firm-specific coefficients estimated from
Equation (7) below. $R_{M,t+1} - R_{f,t+1}$, $SMB_{t+1}$, $HML_{t+1}$, and $MOM_{t+1}$ are the expected monthly
Fama-French and momentum factor returns for month t+1. We estimate the expected monthly factor
returns for month t by calculating each factor's average monthly return over the 60 months prior
to month t. The difference, $R_{M,t} - R_{f,t}$, is the monthly return of the market portfolio in excess of the
risk-free rate, $HML_t$ and $SMB_t$ are the monthly returns to the book-to-market and size
factor-mimicking portfolios, respectively, as described in Fama and French (1993), and $MOM_t$ is the
monthly return to the momentum factor-mimicking portfolio. The risk-adjusted return for firm i in
month t+1 is the difference between the firm's realized return in month t+1, $R_{i,t+1}$, and its expected
return, $ER_{i,t+1}$. We then use these monthly risk-adjusted returns to compute annual returns. In the
hedge return tests, we cumulate returns beginning three months after fiscal year-end to ensure that the
financial statements are available to the public.

For each firm, we estimate the betas associated with the firm's return to each of the
Fama-French and momentum factors by estimating the following monthly time-series regression:


$R_{i,t} - R_{f,t} = a_{i,t} + \beta_{RMRF,i,t}(R_{M,t} - R_{f,t}) + \beta_{SMB,i,t} SMB_t + \beta_{HML,i,t} HML_t + \beta_{MOM,i,t} MOM_t + \varepsilon_{i,t}$,    (7)


where $R_{i,t} - R_{f,t}$ is the firm's monthly return in excess of the risk-free rate. We estimate Equation
(7) using the most recent 60 monthly returns prior to month t. This results in estimated
coefficients, $\hat{\beta}_{RMRF,i,t}$, $\hat{\beta}_{SMB,i,t}$, $\hat{\beta}_{HML,i,t}$, and $\hat{\beta}_{MOM,i,t}$, which are updated monthly. We define our forecast
of each factor beta for month t+1 using the fitted value for that factor for month t, e.g.,
$\hat{\beta}_{RMRF,i,t+1} = \hat{\beta}_{RMRF,i,t}$.
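
The two-step calculation in Equations (6) and (7) can be sketched with standard-library Python as follows. This is an illustration under stated assumptions rather than the study's code: the helper names are hypothetical, the factor series are simulated, the firm's excess returns are generated from known betas so that the regression can be checked, and the risk-free rate for month t+1 is simply passed in.

```python
import random

FACTORS = ("RMRF", "SMB", "HML", "MOM")
WINDOW = 60  # months used for both the betas and the average factor returns


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(y, columns):
    """OLS of y on an intercept plus the given regressor columns (normal equations)."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def expected_return(excess_ret, factors, rf_next):
    """Equation (7) betas from the last WINDOW months, then Equation (6)."""
    coefs = ols(excess_ret[-WINDOW:], [factors[n][-WINDOW:] for n in FACTORS])
    betas = dict(zip(FACTORS, coefs[1:]))  # coefs[0] is the intercept a_i
    premia = {n: sum(factors[n][-WINDOW:]) / len(factors[n][-WINDOW:]) for n in FACTORS}
    return rf_next + sum(betas[n] * premia[n] for n in FACTORS), betas


# Simulated data: factor returns are random draws, and the firm's excess return
# is built from assumed betas with no noise so the fit can be verified.
rng = random.Random(0)
factors = {n: [rng.gauss(0.005, 0.03) for _ in range(WINDOW)] for n in FACTORS}
true_betas = {"RMRF": 1.1, "SMB": 0.4, "HML": -0.2, "MOM": 0.1}
excess = [0.001 + sum(true_betas[n] * factors[n][t] for n in FACTORS)
          for t in range(WINDOW)]

er_next, betas = expected_return(excess, factors, rf_next=0.003)
print(er_next, betas)
```

The risk-adjusted return for the month is then the realized return minus `er_next`; note that the regression intercept is estimated but does not enter the expected return in Equation (6).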

Following Doyle et al. (2003), we compute hedge returns over one-, two-, and three-year 
horizons. We conjecture that if hedge returns continue to increase over longer horizons, then such 
evidence would be indicative of unmodeled risk differences. Therefore, we expect hedge returns to 
flatten over the three-year horizon. To avoid imposing the assumption of normality of the distri- 
bution of excess returns, we report an additional test for significance of the hedge returns using a 
t-test based on a boot-strapping procedure. Specifically, we select firm observations that we ran- 
domly assign to the ten portfolios. We then calculate the hedge return. We repeat this procedure 
1,000 times, thereby generating an empirical distribution that we use to report empirical p-values 
in addition to conventional t-statistics and their implied p-values. 
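
The random-assignment procedure can be sketched as follows. This is a simplified illustration: the function names, the simulated returns, the fixed seed, and the one-sided definition of the empirical p-value (the share of simulated hedge returns at least as large as the observed one) are assumptions made here, not details reported in the study.

```python
import random
from statistics import mean


def random_hedge_return(returns, rng):
    """Randomly assign observations to ten portfolios; return top three minus bottom three."""
    shuffled = returns[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    portfolios = [[] for _ in range(10)]
    for idx, r in enumerate(shuffled):
        portfolios[(idx * 10) // n].append(r)
    long_leg = [r for p in portfolios[7:] for r in p]
    short_leg = [r for p in portfolios[:3] for r in p]
    return mean(long_leg) - mean(short_leg)


def empirical_p_value(observed, returns, n_reps=1000, seed=0):
    """Share of randomly generated hedge returns that are at least as large as observed."""
    rng = random.Random(seed)
    sims = [random_hedge_return(returns, rng) for _ in range(n_reps)]
    return sum(1 for s in sims if s >= observed) / n_reps


# Hypothetical risk-adjusted returns for 200 firm-years.
returns = [i / 1000 for i in range(200)]
# Hedge return if the sort were perfectly informative about returns.
observed = mean(returns[140:]) - mean(returns[:60])

p_strong = empirical_p_value(observed, returns)
p_null = empirical_p_value(-1.0, returns)
print(observed, p_strong, p_null)
```

Because the reference distribution is built from the sample's own returns, the resulting p-value does not rely on the hedge returns being normally distributed.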


IV. SAMPLE AND DESCRIPTIVE STATISTICS 
We obtain most of the data for estimation of Equation (4) and Equation (5) for 1976-2006 
from the Compustat Primary, Secondary, and Tertiary, Full Coverage, and Research Annual Industrial
Files. $DS_t$ and $RDS_t$ are calculated using Compustat and CRSP data as described in the
“Computation of DS and RDS” section. We compute $VCNI^a_t$ as $VCNI_t - rBVE_{t-1}$, where $VCNI_t$
includes both $DS_t$ and $RDS_t$. Following Barth et al. (1999, 2005), Dechow et al. (1999), Bell et al.
(2002), and Landsman et al. (2007), we set r, the cost-of-equity capital, equal to 12 percent, and
we require sample firms to have positive equity book value. We also require that sample firms
have total assets in excess of $10 million to avoid the undue influence of small firms. To mitigate
the effects of outliers, for each variable within each of the three size categories, we treat as missing
observations that are in the extreme top and bottom one percentile. For each sample year, firms are
ranked according to end-of-year market value of common equity and assigned into one of three
equal-sized groups of firms comprising the small, medium, and large firms. We estimate Equation
(4) and Equation (5) using unscaled data (Barth and Kallapur 1996). We assess significance of
regression coefficients using two-way clustered standard errors, with firm and year clusters
(Petersen 2009; Gow et al. 2010).¹⁴ The final sample for estimation of Equation (4) and Equation (5)
comprises 37,097 firm-year observations.

We obtain stock return, R, from CRSP and $R_f$, the one-month Treasury rate, and the Fama-French
and momentum factor returns from the Fama-French database
(http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html). To obtain excess returns per
Equation (6), we estimate factor loadings from Equation (7) using monthly return data beginning
in 1972.¹⁵ There are 30,383 potential DS firm-year excess return observations. However, because
there are 17,579 observations with zero DS, we limit our DS hedge return analysis to the 12,804
nonzero observations. There are 28,346 RDS firm-year excess return observations.

Table 1, Panels A-D present distributional statistics and Panel E presents Pearson and Spear- 
man correlations. Panels A—D reveal that, on average, the market value of equity exceeds the book 
value of equity for all size firms and mean abnormal earnings, $VCNI^a$, is positive for large firms
but negative for medium and small firms. Table 1, Panel E reveals that the explanatory variables 
in Equation (4) and Equation (5) are correlated with each other, but not so much as to raise 
collinearity concerns. Although the distributional statistics reported in Panels A-D reveal the 
variables are skewed, none of the key inferences are affected when the equations are estimated on 
a per-share basis. 

Because typically $P^M > P^A$ and $N_t > N_{t-1}$, we expect RDS to be negative. Table 1, Panels
A-D reveal that RDS turns positive before the 75th percentile. Untabulated statistics reveal that 
RDS turns positive between the 60th and 70th percentiles for two-thirds of the sample years, and 
beyond the 70th percentile for the remaining third. Because it is unlikely that equity transactions 
will give rise to positive RDS, this means that at least some of our observations are measured with 
error. Assuming this error is unsystematic, the implication of this is a reduction in power of our 
tests, particularly those relating to the hedge returns. 

Untabulated statistics reveal that mean RDS is economically largest (i.e., most negative) for
Pharmaceuticals, Services, Food, and Computers, and that mean RDS is economically largest in 
1997, 1998, and 2002. To mitigate the impact of particular industries or years overly influencing 
our results, below we report supplementary findings from hedge return tests that exclude those 
industries and years with the largest mean RDS values. 


13 None of our inferences are affected by assuming alternative values for r, including firm-specific values based on our
multi-factor model.

14 We also compute significance levels using bootstrapping. Untabulated findings result in no changes in inferences from
those based on reported findings for RDS. For DS, the forecasting and valuation coefficients are still consistent, but with
both significantly positive forecasting and valuation coefficients for small and medium firms.

15 Although excess returns can be computed through 2006, our sample stops in 2003. This is because we compute hedge
returns for one-, two-, and three-year horizons, and to facilitate comparison of returns over time, we use a common sample
for the full three-year horizon.

V. RESULTS
Forecasting Equations 


Table 2, Panel A presents regression summary statistics from estimating Equation (4). We 
employ separate estimations for small, medium, and large firms and the pooled sample. Panel A 
reveals that, in all cases, the forecasting coefficient for abnormal very comprehensive income, $\omega_1$, is
significantly positive. It is also increasing in firm size, which is consistent with greater persistence 
for larger firms. 

The incremental forecasting coefficient for DS, $\omega_2$, is significantly different from zero for only
the large firms. More importantly, the total DS forecasting coefficient, $\omega_1 + \omega_2$, is insignificantly
different from zero for all three groups of firms and for the pooled sample (t-statistics = 0.46,
0.98, 0.59, and 0.59).¹⁶ These findings indicate that DS is forecasting-irrelevant for $VCNI^a$ for all
firms. If investors correctly understand the implications of these persistence findings for valuation, 
then we should observe valuation irrelevance of DS for all firms, i.e., DS should have a zero total 
valuation coefficient in the valuation equation. 

The incremental forecasting coefficient for RDS, $\omega_3$, is significantly negative in all cases. The
total RDS forecasting coefficient, $\omega_1 + \omega_3$, is insignificantly different from zero for all firm sizes
and the pooled sample (t-statistics = 1.29, 0.84, −0.85, and −0.81). These findings indicate that
RDS is forecasting-irrelevant for VCNI for all firms. As with DS, if investors correctly understand 
the implications of these persistence findings for valuation, then we should observe valuation 
irrelevance of RDS for all firms. 
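
For reference, a test that a total coefficient is zero is a test of a linear restriction on two regression coefficients. Writing the VCNI forecasting coefficient as $\omega_1$ and the incremental RDS coefficient as $\omega_3$ (notation assumed here), and assuming a conventional estimated coefficient covariance matrix (the estimator actually used is not stated in this section), the statistic takes the familiar form:

```latex
t_{\omega_1 + \omega_3}
  = \frac{\hat{\omega}_1 + \hat{\omega}_3}
         {\sqrt{\widehat{\operatorname{Var}}(\hat{\omega}_1)
              + \widehat{\operatorname{Var}}(\hat{\omega}_3)
              + 2\,\widehat{\operatorname{Cov}}(\hat{\omega}_1, \hat{\omega}_3)}}
```

The covariance term matters: two individually significant coefficients of opposite sign can sum to a total that is indistinguishable from zero, which is the pattern reported for RDS.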


Valuation Equations 


Table 2, Panel B reveals that the valuation coefficient for $VCNI^a$, $\alpha_1$, is significantly positive in all
cases. It is also increasing in firm size, ranging from 1.05 for small firms to 7.83 for large firms,
which is consistent with the pattern of increasing persistence displayed in Table 2, Panel A.

The incremental valuation coefficient for DS, $\alpha_2$, is insignificantly different from zero for
small and medium firms and significantly negative for large firms. More importantly, its total
coefficient, $\alpha_1 + \alpha_2$, is also insignificantly different from zero for all three groups as well as the
pooled sample (t-statistics = 1.03, 1.10, −0.39, and −0.43). This finding is expected based on the
findings about the lack of persistence for DS revealed in Table 2, Panel A. 

The incremental valuation coefficient for RDS, $\alpha_3$, is significantly negative in all cases. Its
total coefficient, $\alpha_1 + \alpha_3$, is also significantly negative for all groups (t-statistics = −5.45, −8.37,
−4.76, and −4.93). Based on the forecasting coefficient findings in Table 2, Panel A, we expect
to observe $\alpha_1 + \alpha_3$ to be insignificantly different from zero for all firm sizes. However, finding that
$\alpha_1 + \alpha_3 < 0$ implies that an incremental dollar of RDS increases market value by less than a
dollar.17 Thus investors appear to undervalue the RDS component of income, i.e., over-value
equity.


16 Throughout we use a 0.05 significance level under a two-sided alternative when evaluating statistical significance. 

17 Note that the market's treatment of RDS as having negative persistence can be restated as implying that the market 
views the benefits to the firm from a transaction that give rise to RDS as exceeding its RDS costs. For this to occur, the 
market would have to believe that benefits would flow to the firm in future periods at a level beyond that which could 
be inferred from the time-series properties of residual income. In other words, RDS would play two roles—being both 
a current period cost and a proxy for an Ohlson-type other information variable. An example of this is the market 
believing an intangible asset arising from employee stock options is greater than the dilution cost to current 
shareholders. 

18 The findings in Table 2 could be attributable to variables predictive of future earnings and returns that are correlated
with RDS. Fairfield et al. (2003) identify two such accounting-based variables: short-term accruals and growth in net 
operating assets. Untabulated findings reveal that inclusion of these variables in Equation (4) and Equation (5) does not 
affect any inferences we draw from Table 2. 
Dirty Surplus Hedge Returns


Table 3 reports buy-and-hold Fama-French risk-adjusted stock returns for firms in the top and 
bottom three deciles of firms classified according to the (signed) magnitude of DS as a fraction of 
equity book value at the beginning of the cumulation period. Results are presented separately for 
small, medium, and large firms, and for the pooled sample. The table presents mean returns for
one-, two-, and three-year horizons and the median values of DS/BVE for each group, as well as
hedge returns and associated t-statistics and empirical p-values.
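
The empirical p-values can be read as a placebo test: the ranking variable is replaced by a random number and the hedge return is recomputed many times. The following is a minimal sketch of that idea, not the authors' code. The function names are hypothetical, and for brevity the sketch pools all observations rather than ranking within each sample year and size group as the table notes describe.

```python
import random
from statistics import mean


def hedge_return(returns, signal, tail=0.3):
    """Mean return of the top `tail` fraction by signal minus that of the bottom."""
    order = sorted(range(len(signal)), key=lambda i: signal[i])
    k = max(1, int(len(order) * tail))
    bottom = [returns[i] for i in order[:k]]
    top = [returns[i] for i in order[-k:]]
    return mean(top) - mean(bottom)


def empirical_p_value(returns, signal, n_sims=1000, seed=0):
    """Share of placebo hedge returns that exceed the observed hedge return.

    Assumption: each simulation swaps the ranking variable for a uniform
    random number, pooling all observations (no per-year or size sort).
    """
    rng = random.Random(seed)
    observed = hedge_return(returns, signal)
    exceed = 0
    for _ in range(n_sims):
        placebo = [rng.random() for _ in returns]
        if hedge_return(returns, placebo) > observed:
            exceed += 1
    return observed, exceed / n_sims
```

A small empirical p-value means that random portfolio assignment rarely produces a hedge return as large as the one observed.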

The findings indicate that for medium and large firms, and for the pooled sample, mean 
risk-adjusted returns and associated hedge returns for all three horizons are essentially zero. 
However, for small firms, the mean risk-adjusted returns are positive and monotonically increasing 
over the three-year horizon, suggesting that Fama-French risk adjustments may not perfectly 
eliminate the pricing effects of risk. Nonetheless, the small firm hedge returns are zero, indicating 
that the mismeasurement of expected return for small firms is unrelated to assignment of obser- 
vations to DS portfolios. The small firm hedge returns being zero is consistent with investors 
pricing DS correctly. 

Taken together, the findings in Tables 2 and 3 suggest little evidence of mispricing of stocks 
based on the magnitude of DS. Alternatively, the findings suggest investors make no adjustments 
to reflect DS items, but this has no pricing implications because when earnings items are transitory 
they should only be impounded into price as a result of being included in the book value of equity. 


Really Dirty Surplus Hedge Returns 


Table 4 reports buy-and-hold Fama-French risk-adjusted stock returns for firms in the top and 
bottom three deciles of firms classified according to the (signed) magnitude of RDS as a fraction 
of equity book value at the beginning of the cumulation period. Table 4 presents statistics
analogous to those presented in Table 3, but based on RDS rather than DS. Results are presented
separately for small, medium, and large firms, and for the pooled sample. 

In contrast to Table 3, the findings in Table 4 generally indicate that risk-adjusted returns 
differ from zero for all firm groups across all three horizons. In addition, the returns move 
increasingly away from zero in absolute terms over the investing horizon for almost every port- 
folio. This is particularly pronounced for small firms, for which the one-, two-, and three-year 
risk-adjusted returns for the top 30 percent (bottom 30 percent) portfolios are 0.08, 0.12, and 0.17 
(0.03, 0.06, 0.08). Recall that the excess returns for small firms in Table 3 also are positive and 
almost identical for firms in the bottom and top 30 percent portfolios, which we attribute to the 
difficulty of measuring expected return for small firms. We can therefore treat the excess returns 
for small firms reported in Table 3 as a benchmark for the measurement error in expected return. 
Using this approach, we can deduct the average of the two portfolio returns for each investing 
horizon to arrive at a better estimate of risk-adjusted returns for the small firms. This results in 
one-, two-, and three-year risk-adjusted returns for the top 30 percent (bottom 30 percent) portfolios
of 0.01, 0.02, and 0.02 (−0.06, −0.04, −0.07). Note that these additional risk adjustments
for the small firms have no effect on their hedge returns because the same adjustment is made to
both the long and short positions.19
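
The benchmark adjustment described above can be sketched as follows. This is an illustration only: the inputs are the two-decimal means printed in Tables 3 and 4, so the adjusted values need not match the figures quoted in the text, which were presumably computed from unrounded data. The function names are hypothetical.

```python
# Rounded small-firm means as printed in Tables 3 and 4 (1-, 2-, 3-year horizons).
TABLE3_SMALL = {"bottom": [0.08, 0.11, 0.16], "top": [0.07, 0.10, 0.15]}
TABLE4_SMALL = {"bottom": [0.03, 0.06, 0.08], "top": [0.08, 0.12, 0.17]}


def benchmark_adjust(rds, ds):
    """Subtract the average of the two DS portfolio returns at each horizon."""
    benchmark = [(b + t) / 2 for b, t in zip(ds["bottom"], ds["top"])]
    return {
        leg: [r - m for r, m in zip(rds[leg], benchmark)]
        for leg in ("bottom", "top")
    }


def hedge(portfolios):
    """Top-minus-bottom return at each horizon."""
    return [t - b for t, b in zip(portfolios["top"], portfolios["bottom"])]


adjusted = benchmark_adjust(TABLE4_SMALL, TABLE3_SMALL)
```

Because the same benchmark is subtracted from both legs, `hedge(adjusted)` equals `hedge(TABLE4_SMALL)` at every horizon, which is the point made in the text.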

Turning to the hedge returns, Table 4 reveals that they are significantly positive in all three 
firm-size groups and for the pooled sample. Table 4 also reveals that the hedge returns are 


19 A similar adjustment to excess returns for medium and large firms could be made. However, Table 3 indicates that
excess returns for medium and large firms are bounded between −0.01 and 0.02, which suggests that measurement error
in expected returns is relatively immaterial for these groups of firms. 
TABLE 3


Mean Fama-French Risk-Adjusted Stock Returns in Top and Bottom Three Deciles of
Non-Zero Dirty Surplus Deflated by Book Value of Owner Equity for a Sample of 12,804
Firm-Year Observations
1976-2003 (a, b)

                            No. of Obs.        MVE   DS/BVE 1-Year  2-Year  3-Year
Small Firms Portfolio (c)
  Bottom 30%                      1,314     162.30   -0.022   0.08    0.11    0.16
  Top 30%                         1,303     146.09    0.016   0.07    0.10    0.15
  Hedge Return (d)                                           -0.01    0.00    0.00
  t-stat.                                                    -0.51   -0.13   -0.07
  Empirical p-value (e)                                       0.72    0.56    0.54
Medium Firms Portfolio (c)
  Bottom 30%                      1,279   1,075.51   -0.021  -0.01    0.00    0.01
  Top 30%                         1,272     934.83    0.016   0.02    0.01    0.00
  Hedge Return (d)                                            0.03^   0.00   -0.02
  t-stat.                                                     1.71    0.20   -0.62
  Empirical p-value (e)                                       0.03    0.39    0.73
Large Firms Portfolio (c)
  Bottom 30%                      1,260  14,073.49   -0.023   0.00    0.00    0.00
  Top 30%                         1,254  12,525.09    0.016   0.01   -0.01   -0.01
  Hedge Return (d)                                            0.00   -0.01   -0.01
  t-stat.                                                     0.34   -0.46   -0.55
  Empirical p-value (e)                                       0.36    0.71    0.73
Pooled Sample Portfolio (c)
  Bottom 30%                      3,853   5,014.65   -0.022   0.03    0.04    0.06
  Top 30%                         3,829   4,462.24    0.016   0.03    0.03    0.05
  Hedge Return (d)                                            0.01   -0.00   -0.01
  t-stat.                                                     0.55   -0.19   -0.49
  Empirical p-value (e)                                       0.33    0.58    0.67

^ Indicates hedge return is significantly different from zero at less than the 0.10 level.
a See Table 1 for definitions of all variables.
b Fama-French risk-adjusted return is a firm's actual return in excess of the risk-free rate less the firm's predicted return
  based on the Fama-French factor and momentum factor mimicking portfolios, i.e., excess market return, size, book-to-market,
  and momentum factor.
c Firm's size designation and DS portfolio ranking are assigned in the following procedure: for each sample year, firms are
  ranked according to DS as a fraction of end of year equity book value, BVE, and assigned into ten equal-sized portfolios
  whereby the first (tenth) portfolio contains those observations with the smallest (largest) fraction of DS; within each of
  the ten DS portfolios, firms are ranked according to firm size, i.e., equity market value, and are assigned to one of three
  equal-sized groups of firms comprising the small, medium, and large firms.
d The hedge return is computed by deducting the mean risk-adjusted return on the bottom three deciles portfolio from that
  on the top three deciles portfolio. The strategy implementation begins three months subsequent to the firm's fiscal
  year-end.
e The proportion of the hedge returns from 1,000 simulations exceeds the observed DS-based hedge return. In a simulation,
  each firm is assigned a random number as the substitute for DS, and accordingly the portfolio ranking and size
  designation following the procedure in table note c.
TABLE 4


Mean Fama-French Risk-Adjusted Stock Returns in Top and Bottom Three Deciles of
Really Dirty Surplus Deflated by Book Value of Owner Equity for a Sample of 28,346
Firm-Year Observations
1976-2003 (a, b)

                            No. of Obs.        MVE  RDS/BVE 1-Year  2-Year  3-Year
Small Firms Portfolio (c)
  Bottom 30%                      2,870     114.23   -0.046   0.03    0.06    0.08
  Top 30%                         2,860      54.91    0.004   0.08    0.12    0.17
  Hedge Return (d)                                            0.06    0.07    0.09
  t-stat.                                                     3.61*   2.61*   2.81*
  Empirical p-value (e)                                      <0.01    0.01   <0.01
Medium Firms Portfolio (c)
  Bottom 30%                      2,834     670.28   -0.046  -0.02   -0.05   -0.07
  Top 30%                         2,824     351.69    0.004   0.01    0.01    0.00
  Hedge Return (d)                                            0.03    0.06    0.07
  t-stat.                                                     2.71*   3.92*   3.94*
  Empirical p-value (e)                                      <0.01   <0.01   <0.01
Large Firms Portfolio (c)
  Bottom 30%                      2,812  10,156.35   -0.047  -0.02   -0.04   -0.06
  Top 30%                         2,805   5,625.89    0.004   0.01    0.02    0.02
  Hedge Return (d)                                            0.04    0.06    0.08
  t-stat.                                                     4.25*   5.39*   5.83*
  Empirical p-value (e)                                      <0.01   <0.01   <0.01
Pooled Sample Portfolio (c)
  Bottom 30%                      8,516   3,615.20   -0.046  -0.01   -0.01   -0.02
  Top 30%                         8,489   1,994.44    0.004   0.04    0.05    0.07
  Hedge Return (d)                                            0.04    0.06    0.08
  t-stat.                                                     5.84*   5.81*   6.00*
  Empirical p-value (e)                                      <0.01   <0.01   <0.01

* Indicates hedge return is significantly different from zero at less than the 0.05 level.
a See Table 1 for definitions of all variables.
b Fama-French risk-adjusted return is a firm's actual return in excess of the risk-free rate less the firm's predicted return
  based on the Fama-French factor and momentum factor mimicking portfolios, i.e., excess market return, size, book-to-market,
  and momentum factor.
c Firm's size designation and RDS portfolio ranking are assigned in the following procedure: for each sample year, firms
  are ranked according to RDS as a fraction of end of year equity book value, BVE, and assigned into ten equal-sized
  portfolios whereby the first (tenth) portfolio contains those observations with the smallest (largest) fraction of RDS;
  within each of the ten RDS portfolios, firms are ranked according to firm size, i.e., equity market value, and are assigned
  to one of three equal-sized groups of firms comprising the small, medium, and large firms.
d The hedge return is computed by deducting the mean risk-adjusted return on the bottom three deciles portfolio from that
  on the top three deciles portfolio. The strategy implementation begins three months subsequent to the firm's fiscal
  year-end.
e The proportion of the hedge returns from 1,000 simulations exceeds the observed RDS-based hedge return. In a
  simulation, each firm is assigned a random number as the substitute for RDS, and accordingly the portfolio ranking and
  size designation following the procedure in table note c.
increasing over time. For example, for small firms, the one-, two-, and three-year hedge returns are
0.06, 0.07, and 0.09. It is possible that the increasing hedge returns over time could be attributable 
to an unmodeled risk factor.20,21


Additional RDS Hedge Return Tests 


In this section, we consider several additional tests to examine the sensitivity of the RDS 
hedge returns to previously documented pricing anomalies and risk factors, and to alternative 
procedures for computing those returns. We also attempt to determine the extent to which different 
source components of RDS account for our hedge return results. 

First, we investigate whether the persistence of the RDS hedge returns could reflect the effects 
of pricing anomalies documented in prior research. We consider each anomaly, in turn, by first 
placing observations into one of ten portfolios based on the magnitude of each anomaly factor.
Then, within each portfolio, we rank observations according to the magnitude of RDS as a fraction 
of BVE, assigning each observation within each anomaly portfolio to one of ten RDS portfolios. 
This double-sorting process helps to ensure that our findings are not the result of the particular 
anomaly used in the initial sort. 
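
The double sort can be sketched in a few lines. This is a simplified illustration with hypothetical function names, not the authors' procedure: it sorts a single cross-section and ignores the by-year and size-group steps described in the table notes.

```python
from collections import defaultdict


def decile_ranks(values, n_groups=10):
    """Return a 0-based group index per item, splitting sorted items into near-equal groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = position * n_groups // len(values)
    return ranks


def double_sort(anomaly, rds_to_bve, n_groups=10):
    """Sort on the anomaly factor first, then rank RDS/BVE within each anomaly portfolio."""
    outer = decile_ranks(anomaly, n_groups)
    members = defaultdict(list)
    for i, group in enumerate(outer):
        members[group].append(i)
    inner = [0] * len(anomaly)
    for idx in members.values():
        local = decile_ranks([rds_to_bve[i] for i in idx], n_groups)
        for i, rank in zip(idx, local):
            inner[i] = rank
    return outer, inner
```

Pooling observations with the same inner rank across all outer portfolios yields RDS portfolios that are, by construction, spread evenly over the anomaly factor.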

The pricing anomalies we consider are: the short-term accruals anomaly (Sloan 1996; Xie 
2001); the growth in long-term asset accrual anomaly (Fairfield et al. 2003); the long-term pricing 
reversal anomaly (Daniel and Titman 2006; Fama and French 2008a), and the share repurchase
and issuance anomaly (Ikenberry et al. 1995; Daniel and Titman 2006; Mitchell and Stafford 2000;
Fama and French 2008a). The share repurchase and issuance anomaly is potentially most closely
related to the RDS pricing anomaly we document because the latter can only arise from equity 
transactions. Untabulated findings reveal that RDS hedge returns remain significantly positive after 
controlling for the potential confounding effects of each of the four anomalies. Thus, mispricing 
associated with previously documented anomalies appears not to account for our finding of investor
mispricing of RDS. 

Second, we consider an alternative to our buy-and-hold procedure for computing RDS hedge 
returns. Following Mashruwala et al. (2006), we estimate hedge returns using a monthly calendar- 
time portfolio approach (Fama and French 1993). Under this approach, hedge returns are calcu- 
lated based on Jensen's alphas from monthly time-series regressions of hedge portfolio excess 
returns on Fama-French and momentum factor returns. For each sample year, we assign firms into 
decile portfolios based on RDS as a fraction of equity book value and compute the mean monthly 
portfolio return for firms in the top and bottom three deciles. We then estimate the following 
monthly time-series regression for both the high and low RDS portfolios: 


$$R_{p,t} - R_{f,t} = \alpha_p + \beta_p (R_{M,t} - R_{f,t}) + s_p SMB_t + h_p HML_t + m_p MOM_t + \varepsilon_{p,t} \qquad (8)$$


The resulting Jensen's alpha, $\alpha_p$, measures the mean monthly return for portfolio p not attributable
to the Fama-French and momentum factor returns. We predict that $\alpha_p$ for the high RDS portfolio
is larger than that for the low RDS portfolio. We formally test this by estimating Equation (8) for 
a hedge portfolio, which is constructed as the difference in mean monthly excess returns for the 


20 One possible candidate is firm size, as Table 4 indicates that firms in the bottom 30 percent RDS portfolios are roughly
twice as large as those in the top 30 percent. To assess the importance of firm size on hedge returns, we regressed excess 
return on firm size and an indicator variable for whether a firm-year observation is in the top 30 percent RDS portfolio. 
Untabulated findings indicate that the indicator variable coefficient is significantly positive in all cases and over all 
horizons, and the coefficient on size is insignificant in all specifications. 

21 In addition, it is possible that significant hedge returns are induced by our implementation of risk-adjusting returns.
Untabulated hedge returns computed without explicit adjustment for risk yield inferences consistent with those based
on the tabulated findings.
high RDS and low RDS portfolios,22 and then testing whether the resulting Jensen’s alpha is
significantly positive. 
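
As a rough illustration of the calendar-time approach, the sketch below estimates the intercept of a four-factor regression by ordinary least squares. It is a plain-Python toy with hypothetical function names, solved through the normal equations for self-containment; it is not the authors' implementation and omits standard errors.

```python
def ols(y, X):
    """Least-squares coefficients for y = X b, solved via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def jensens_alpha(excess_returns, factor_rows):
    """Intercept from regressing monthly excess returns on factor returns.

    factor_rows: one row per month, e.g. (market excess, SMB, HML, MOM).
    """
    X = [[1.0] + list(row) for row in factor_rows]
    return ols(excess_returns, X)[0]
```

Because the regressors are identical for both portfolios, the alpha of the high-minus-low hedge portfolio equals the difference between the two portfolio alphas, so a single regression on the hedge portfolio suffices for the test.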

Consistent with our predictions, untabulated findings indicate that Jensen’s alpha for the high 
RDS portfolio is larger than that for the low portfolio for all three firm-size groups: 0.005 versus 
0.003 (small firms); 0.001 versus 0.000 (medium firms); 0.001 versus 0.000 (large firms). For the 
pooled sample, the high and low portfolio Jensen’s alphas are 0.002 and 0.000, respectively. The
Jensen’s alpha of the hedge portfolio is statistically significant for only the small firms and the 
pooled sample, 0.003 (t-statistic = 2.25; p-value = 0.03) and 0.002 (t-statistic = 2.27; p-value = 
0.02), which in annual terms indicate excess returns to the RDS-based hedging strategy of 3.7 
percent and 2.4 percent, respectively. These findings contrast with those from the buy-and-hold 
hedge returns in Table 4, which indicate hedge returns are positive for all three firm-size groups. 
Ascertaining which approach yields the more reliable results is not straightforward. For example,
the buy-and-hold approach has the advantage of updating the individual stock’s expected return on
a monthly basis using out-of-sample estimation, whereas the alpha approach assumes that factor betas
are constant during the test period.23
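
The annualized figures quoted above are consistent with compounding the monthly alpha over twelve months. The text does not state the convention, so the compounding below is an assumption; it reproduces the reported 3.7 and 2.4 percent, whereas simply multiplying 0.003 by twelve would give 3.6 percent.

```python
def annualize_monthly(alpha_monthly, periods=12):
    """Compound a monthly return into an annual return (assumed convention)."""
    return (1.0 + alpha_monthly) ** periods - 1.0


small_firms = round(annualize_monthly(0.003) * 100, 1)    # 3.7 (percent)
pooled_sample = round(annualize_monthly(0.002) * 100, 1)  # 2.4 (percent)
```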

Third, we attempt to determine the extent to which different source components of RDS 
account for our hedge return results. Findings from the forecasting and valuation tests are consis- 
tent with, in the aggregate, RDS components of very comprehensive net income being transitory, 
but investors failing to understand this and over-valuing equity. Ideally, we would like to identify 
separate components of RDS, determine their persistence, and then ascertain which components 
investors appear to fail to price correctly. Because we face the same data limitations as investors, 
we can only do this indirectly by sequentially excluding and including firm-years in which RDS is 
more likely to be attributable to one particular type of transaction. First, to focus on the potential 
pricing effects of pooling-of-interests transactions, we recomputed hedge returns (1) excluding 
firm-years with corporate mergers or acquisitions, and (2) using only firm-years with corporate 
mergers.24 Second, to focus on the potential pricing effects of employee stock options, we recomputed
hedge returns using (1) only observations beginning in 1995, and (2) only observations
before 1995. Because SFAS No. 123 required disclosure of weighted average exercise price for
employee stock options after 1995, hedge returns might be expected to fall or even disappear in
the later (post-1995) period. Third, to focus more generally on the pricing effects of dilutive transactions,
including warrants, convertible instruments, as well as stock options, we recomputed hedge re- 
turns for portfolios sorting firm-year observations on the difference between basic and diluted 
earnings per share as a fraction of equity book value. Finally, because RDS is concentrated in 
particular industries and years, as noted in Section IV, we sequentially calculated hedge returns 
excluding the six industries and the five years with the most negative RDS. 

Untabulated findings from each of these additional analyses reveal that hedge returns remain 
significant in all cases. These findings are consistent with our tests lacking power to trace the 
precise sources of RDS. They are also consistent with RDS being correlated with an unmodeled 
risk factor. 


22 The independent variables (risk factors) are the same for each portfolio and are therefore included in the hedge portfolio 
without adjustment. 

23 For a discussion of the strengths and weaknesses of the various approaches, see Fama and French (2008b).

24 Ideally, we would compute hedge returns for subsamples excluding and including only those firm-years with mergers
accounted for under pooling-of-interests. However, because such identifying information is only available to us using 
the Securities Data Corporation beginning in the middle of our sample period, we elected to cast the net wider to take 
account of mergers and acquisitions during our sample period. 
VI. CONCLUSIONS

The question we address in this study is whether firms' share prices correctly reflect two 
accounting measures, dirty surplus and really dirty surplus. We find that both dirty surplus and 
really dirty surplus are forecasting-irrelevant for abnormal very comprehensive income for all 
firm-size groups. Taking these results at face value (i.e., assuming that the forecasting and valuation
equations correctly capture the time-series properties of these variables), if investors correctly
understand the implications of these persistence findings for valuation, then each kind of
dirty surplus should be valuation-irrelevant for all firms. This prediction is borne out in the case of 
dirty surplus. In contrast, the findings indicate that investors appear to undervalue really dirty 
surplus, which is consistent with the premise that investors are unable to assess the economic
implications of really dirty surplus transactions. However, the possibility remains that the mis- 
match of the really dirty surplus forecasting and valuation coefficients is the result of model 
misspecification rather than mispricing. 

Our buy-and-hold hedge return results support the findings from the tests linking the fore- 
casting and valuation equations. Hedge returns are insignificantly different from zero when based 
upon dirty surplus, regardless of firm size and investing horizon. In contrast, buy-and-hold hedge 
returns based on really dirty surplus are significantly positive for all three firm-size groups as well
as for the pooled sample. Findings from additional tests reveal that inferences relating to hedge
returns are insensitive to an alternative procedure for measuring hedge returns, to including controls
for four previously identified mispricing anomalies, and to sampling procedures designed to focus 
on sources of really dirty surplus. 

Taken together, the findings are consistent with investors failing to understand the lack of 
persistence of really dirty surplus, and therefore apparently over-valuing firms that have large 
negative really dirty surplus. However, because we are unable to trace the sources of really dirty 
surplus to particular types of equity transactions, we cannot determine the extent to which potential
mispricing arises from each type of transaction. Moreover, even if we could trace the sources
of really dirty surplus, any resulting hedge returns might still be attributable to an unmodeled risk 
factor. 
ABSTRACT: This study examines whether differences in proxies for audit quality between
Big 4 and non-Big 4 audit firms could be a reflection of their respective clients'
characteristics. In our analyses, we use three audit-quality proxies—discretionary ac- 
cruals, the ex ante cost-of-equity capital, and analyst forecast accuracy—and employ 
propensity-score and attribute-based matching models in an attempt to control for differences
in client characteristics between the two auditor groups while estimating the
audit-quality effects. Using these matching models, we find that the effects of Big 4
auditors are insignificantly different from those of non-Big 4 auditors with respect to the
three audit-quality proxies. Our results suggest that differences in these proxies be- 
tween Big 4 and non-Big 4 auditors largely reflect client characteristics and, more 
specifically, client size. We caution the reader that this study has not resolved the 
question, although we hope that it encourages other researchers to explore alternative 
methodologies that separate client characteristics from audit-quality effects. 
I. INTRODUCTION

The auditing literature generally concludes that the audit quality of Big 4 auditors is superior
to that of non-Big 4 auditors.1 DeAngelo (1981) argues that accounting firm size is a proxy
for auditor quality, as no single client is important to larger accounting firms and, hence,
larger accounting firms are less likely than smaller accounting firms to compromise their independence.
Dopuch and Simunic (1980) propose that larger accounting firms provide higher quality
services because they have greater reputations to protect. Furthermore, it could be argued that Big
4 firms provide superior audit quality as their sheer size can support more robust training programs,
standardized audit methodologies, and more options for appropriate second partner reviews.

However, there are also arguments as to why Big 4 and non-Big 4 firms could provide 
comparable audit quality. First, Big 4 and non-Big 4 firms are held to the same regulatory and 
professional standards, and thus both types of audit firms must adhere to a reasonable level of 
quality. Second, as "non-Big 4 auditors have superior knowledge of local markets and better 
relation with their clients" (Louis 2005, 77), these factors may enable non-Big 4 firms to better 
detect irregularities. Of course, the converse argument could be made that closer relationships 
among non-Big 4 accounting firms and their clients could potentially lead to a compromise of 
independence; however, the net effect of these counteracting forces is unclear. Third, the inability 
of non-Big 4 firms to obtain affordable insurance coverage may actually increase the audit effort 
of non-Big 4 firms relative to Big 4 firms because smaller audit firms cannot obtain a similar level 
of backing from insurance companies. This notion is supported by the U.S. Government Account- 
ability Office (GAO) report issued in 2008 indicating that non-Big 4 auditors are struggling to 
obtain affordable liability insurance coverage (GAO 2008, 55). Moreover, the previous GAO audit 
firm concentration report issued in 2003 finds that "smaller firms lacked the size needed to achieve 
economies of scale to spread their litigation risk and insurance costs across a larger capital base"
(GAO 2003, 49), suggesting that insurance fees are a fixed cost to audit partners. Finally, CPAs 
frequently switch between Big 4 and non-Big 4 firms, and knowledge transfers can dilute the 
potential for one type of audit firm to become superior. For example, upon the fallout of Arthur 
Andersen (AA), Grant Thornton acquired AA's Albuquerque, Charlotte, Columbia, Greensboro, 
and Orlando offices and separately hired many former middle-market AA audit partners from other 
offices (Dow Jones News Service 2002). Hence, it is not obvious from theory or intuition that Big
4 firms should be superior to non-Big 4 firms. 

Nonetheless, numerous empirical studies, following the theoretical support of DeAngelo 
(1981) and Dopuch and Simunic (1980), and using a variety of audit-quality proxies, find evidence 
suggesting that Big 4 auditors provide higher-quality audits than non-Big 4 auditors (e.g., Palm- 
rose 1988; Becker et al. 1998; Khurana and Raman 2004; Behn et al. 2008). Given that the 
distributions of client characteristics significantly differ across Big 4 and non-Big 4 firms, an 
important consideration of these prior studies is whether the empirical findings simply reflect 
client and not accounting firm characteristics. Our research objective is to investigate whether the 
Big 4 treatment effect may be attributable to client characteristics. 

The question of Big 4 superiority is important, given that many studies rely on the Big 4 
versus non-Big 4 distinction as an audit-quality proxy. Hence, it is prudent to confirm that this 
distinction does not simply reflect client characteristics.2 Furthermore, incorrectly classifying Big


1 We use Big 4 as a generic term encompassing the Big 8, Big 6, Big 5, and Big 4 to reflect the consolidation of these
firms.

2 Several studies outside the auditing literature have used the Big 4 distinction to proxy for a higher level of audit quality;
for example, Beatty (1989), Guenther and Willenborg (1999), Mitton (2002), Smart and Zutter (2003), and Gul et al. 
(2009). 
4 auditors as superior to non-Big 4 auditors has unnecessary negative ramifications for smaller
auditors, such as audit committee's auditor selection bias (CFA Institute Center 2009) and dis- 
criminatory clauses in loan and underwriting agreements (DeAngelo 1981), which could result in 
a loss of current and future clients. 

Using three audit-quality proxies—discretionary accruals, the ex ante cost-of-equity capital, 
and analyst forecast accuracy—we replicate previous findings that document differences in quality 
between Big 4 and non-Big 4 audit firms. Following Rosenbaum and Rubin (1983), Li and 
Prabhala (2007), and Francis et al. (2010), we use propensity-score matching models in an attempt 
to control for differences in client characteristics between the two auditor groups while estimating 
auditor treatment effects. Moreover, we match on specific client characteristics (attribute-based 
matching), such as client size and return on assets, to identify whether Big 4 versus non-Big 4 
differences in audit-quality proxies can be attributed to specific client characteristics. Using these 
matching models, we find that the treatment effects of Big 4 auditors are insignificantly different 
from those of non-Big 4 auditors with respect to our three audit-quality proxies. Furthermore, the 
attribute-matching results provide evidence for the argument that differences in the foregoing 
proxies between Big 4 and non-Big 4 clients largely reflect client size. Using our full samples, we 
control for an extensive list of firm characteristics known to influence audit quality and find that 
the Big 4 effect is generally insignificant, indirectly supporting the argument that the Big 4 
distinction may reflect client and not auditor characteristics. 
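
To make the attribute-based matching idea concrete, the sketch below pairs each Big 4 client with the closest unmatched non-Big 4 client on a single attribute, such as client size, and discards pairs whose gap exceeds a caliper. The algorithm (greedy one-to-one nearest-neighbor matching without replacement), the function names, and the toy numbers in the usage note are all assumptions for illustration; this section does not specify the matching details used in the study.

```python
def match_on_attribute(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated / controls: lists of (attribute, outcome) tuples,
    e.g. (log total assets, absolute discretionary accruals).
    Returns (treated, control) pairs whose attribute gap is within `caliper`.
    """
    available = list(controls)
    pairs = []
    for t in sorted(treated):
        if not available:
            break
        best = min(available, key=lambda c: abs(c[0] - t[0]))
        if abs(best[0] - t[0]) <= caliper:
            pairs.append((t, best))
            available.remove(best)
    return pairs


def mean_outcome_difference(pairs):
    """Average treated-minus-control outcome across matched pairs."""
    return sum(t[1] - c[1] for t, c in pairs) / len(pairs)
```

As a usage illustration with made-up data, suppose the outcome depends only on size. An unmatched comparison of larger Big 4 clients with smaller non-Big 4 clients then shows a difference in means, while the matched comparison shows none, which is the kind of pattern the matching analysis is designed to detect.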

Our results must be interpreted with due regard to their limitations and to the caveats of 
matching models, discussed in more detail throughout the article. This study has not resolved the 
underlying question as to whether differences in audit-quality proxies between Big 4 and non-Big 
4 auditors can be attributed to client characteristics, but we hope that it encourages other research- 
ers to explore alternative methodologies that further disentangle client characteristics from audit- 
quality effects. 


II. RELATED STUDIES AND MEASURES OF AUDIT QUALITY

As the main observable outcome of an audit is the standardized audit report, researchers have 
used various proxies in an attempt to assess audit quality and, in turn, determine whether a 
differential in audit quality exists. An extensive branch of audit differentiation research focuses on
the quality of the client's financial statements, in which discretionary accruals are often used as a 
proxy for audit quality, as they reflect the auditor's constraint over management's reporting deci- 
sions. Becker et al. (1998) find that Big 4 clients report lower absolute discretionary accruals than 
non-Big 4 clients. Francis et al. (1999) suggest that Big 4 auditors constrain opportunistic and 
aggressive reporting because their clients have higher total accruals but lower discretionary ac- 
cruals. Krishnan (2003) finds a greater association between discretionary accruals and future 
earnings for Big 4 than for non-Big 4 clients. Following this literature, we use discretionary 
accruals as our first measure of audit quality. The benefit of this measure is that it reflects the 
auditor's enforcement of accounting standards. However, a weakness is that it only partially 
captures the effectiveness of an audit in constraining earnings management, as discretionary ac- 
cruals not only reflect management's opportunism, but also management's signaling attempts and 
random noise, as noted by Guay et al. (1996). 

The valuation literature suggests that Big 4 auditors provide more assurance to the market 
than non-Big 4 auditors. The intuition for using valuation proxies as a measure of audit
quality is that if market participants perceive Big 4 clients to have more
credible earnings than non-Big 4 clients, then, ceteris paribus, Big 4 clients should
receive a reduction in their cost-of-equity capital. For example, Khurana and Raman (2004) find that
Big 4 clients have a lower ex ante cost of capital than non-Big 4 clients in the U.S.; however, they 
do not find such a difference in Australia, Canada, or Great Britain. Following Khurana and 
Raman (2004), we use the ex ante cost-of-equity capital as our second audit-quality proxy to
capture the capital market's perception of the financial reporting credibility of Big 4 and non-Big
4 clients.

More recently, Behn et al. (2008) include analyst forecast accuracy as an audit-quality proxy. 
They argue that if one type of auditor increases the reporting reliability of earnings in comparison 
to the other type, then, ceteris paribus, analysts of the superior type's clients should be able to 
make more accurate forecasts of future earnings than those analysts of the non-superior type's 
clients. Using this reasoning, Behn et al. (2008) find that analysts of Big 4 clients have higher
forecast accuracy than analysts of non-Big 4 clients. We use analyst forecast accuracy as our third
audit-quality measure to proxy for an enhanced level of decision making by sophisticated financial
statement users. 

In summary, we employ three audit-quality proxies—discretionary accruals, the ex ante cost- 
of-equity capital, and analyst forecast accuracy—to capture various aspects of audit quality in 
comparing Big 4 to non-Big 4 firms.


III. MATCHING PROCEDURES AND TESTS OF NONLINEARITY

We use propensity-score matching models, developed by Rosenbaum and Rubin (1983), to 
match on a broad range of client characteristics and use attribute-based matching to examine 
whether the Big 4 distinction can be attributed to specific client characteristics. Propensity-score 
matching models match observations based on the probability of undergoing the treatment, which 
in our case is the probability of selecting a Big 4 auditor. Matching models have important features 
appropriate for our setting. First, these models generate samples in which the clients of Big 4 and
non-Big 4 auditors are similar, providing a natural framework to parse out the effects of auditor 
and client characteristics on the audit-quality proxy. Second, previous studies examining the ex- 
istence of a differential in audit quality typically use Heckman (1979) selection models that rely 
on a specific functional form to provide an indirect estimate of auditor treatment effects. Matching 
models do not rely on a specific functional form and provide a more direct estimate of the 
treatment effects (Li and Prabhala 2007). Moreover, selection models pertaining to auditor choice 
could estimate treatment effects with error by failing to meet the exclusion restrictions because it 
is difficult to identify variables that influence the auditor selection (first-stage regression) and not 
the audit-quality proxy (second-stage regression). Last, matching models mitigate the potential 
impact of nonlinearities in estimating the treatment effects when the underlying functional form is 
nonlinear. A recent example of a study using a propensity-score matching model in the accounting
literature is Armstrong et al. (2010), who find that the previously documented relation between 
equity-based compensation and accounting irregularities does not hold using propensity-score 
matching models. 

Notwithstanding their benefits, matching models have caveats. First, matching models rely on
the assumption that the effects of unobservables are not pertinent to estimating treatment effects. 
Second, matching results in using subsamples of the population, and hence, as noted by Cram et 
al. (2009), matching reflects a trade-off between identifying the treatment effects and generalizing 
the results to the full population. Third, matching results in a different auditor composition in the 


* Other branches of the auditor differentiation literature use litigation, going concern, and fraud frequencies as audit-
quality proxies. These branches suggest that Big 4 firms are sued and sanctioned less (Palmrose 1988; Feroz et al. 1991),
have lower thresholds for and greater accuracy in issuing opinions (Francis and Krishnan 1999; Lennox 1999; Weber
and Willenborg 2003; Geiger and Rama 2006), and have a lower incidence of fraud among their clients than non-Big 4
firms (Farber 2005). We do not use these proxies because the infrequent occurrence of these events results in small
sample sizes with low power, potentially biasing in favor of our inferences. For example, Dechow et al. (1996) and
Beneish (1999), using small samples, do not find a significant difference between the fraud incidence of Big 4 and 
non-Big 4 clients. 
matched samples than in the full population. These features of matching models could bias our
findings due to the reduced power of the tests or due to systematic differences in the subsamples 
from the full population. Fourth, the empirical inability to match on pre-treatment attributes and 
control for alternative treatments could result in a bias if the matching variables are affected by the 
auditor choice. Despite our attempts, we cannot fully rule out concerns that our inferences are 
affected by ex post matching or alternative treatments. In the "Sensitivity Analyses" section, we 
document a series of robustness tests in an attempt to mitigate some of these concerns.

We use a logit model to estimate the probability of selecting a Big 4 auditor, as it is the most
prevalent approach for estimating propensity scores (Guo and Fraser 2010, 135). Given that
matching models do not require exclusion restrictions, the general rule is to include a comprehen- 
sive list of attributes when estimating the propensity score (Li and Prabhala 2007). We estimate 
the propensity score for each audit-quality proxy analysis, including variables from the selection 
model used in Chaney et al. (2004) and the respective audit quality regression, as follows: 


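\[
\mathit{BIG4}_{i,t} = \beta_0 + \beta_1 \mathit{LOG\_ASSETS}_{i,t} + \beta_2 \mathit{ATURN}_{i,t} + \beta_3 \mathit{CURR}_{i,t} + \beta_4 \mathit{LEV}_{i,t} + \beta_5 \mathit{ROA}_{i,t}
    + \sum \mathit{PROXY\_CONTROLS}_{i,t} + \mathit{Industry\_FE} + \mathit{Year\_FE} + \varepsilon_{i,t}, \tag{1}
\]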


where for firm i and fiscal year t: 


BIG4 = 1 if the client has a Big 4 auditor in year t, and 0 otherwise;
LOG_ASSETS = natural logarithm of total assets at the end of year t;
ATURN = sales_t / total assets_{t-1};
CURR = current assets_t / current liabilities_t;
LEV = (long-term debt_t plus debt in current liabilities_t) / average total assets_{t-1};
ROA = net income_t / average total assets_{t-1}; and
PROXY_CONTROLS = control variables used and described in the respective audit-quality
analysis.
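
As a rough illustration of how a propensity score in the spirit of Equation (1) could be estimated, the following sketch fits a logit model by plain gradient ascent. It is a minimal sketch, not the study's estimation code: the toy data, the two-regressor specification (LOG_ASSETS and ROA only), and the names `fit_logit` and `propensity` are all hypothetical, and fixed effects and the remaining controls are omitted.

```python
import math

def fit_logit(X, y, lr=0.1, n_iter=5000):
    """Fit a logit model by gradient ascent on the mean log-likelihood.

    X: list of regressor rows (no intercept column); y: list of 0/1 labels.
    Returns [intercept, b1, ..., bk].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, label in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            p = 1.0 / (1.0 + math.exp(-z))
            err = label - p                      # score contribution
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, row):
    """Predicted probability of selecting a Big 4 auditor."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical client-years: [LOG_ASSETS, ROA]; y = 1 for a Big 4 client
X = [[3.0, -0.10], [3.5, 0.02], [4.0, -0.05], [4.5, 0.03],
     [5.0, 0.01], [5.5, 0.04], [6.0, 0.05], [6.5, 0.02]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

beta = fit_logit(X, y)
scores = [propensity(beta, row) for row in X]
```

In this toy sample larger clients receive higher estimated scores, which mirrors the role of client size in auditor selection discussed in the text.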


We estimate the propensity-score model by including all audit proxy control variables for the 
respective analysis, but we exclude variables from the regression if they are redundant to the first 
five variables in predicting auditor choice. We then match, without replacement, a non-Big 4 audit 
client with a Big 4 audit client that has the closest predicted value from Equation (1) within a 
maximum distance of 3 percent. Using this caliper distance, we match approximately 80 percent,
99 percent, and 97 percent of non-Big 4 audit clients to Big 4 audit clients for the discretionary 
accruals, the ex ante cost-of-equity capital, and the analyst forecast-accuracy samples, respec- 
tively. In effect, this procedure creates a pseudo "random" sample in which the auditor type is 
randomly allocated to both the treatment and control groups (Heckman and Navarro-Lozano 
2004), such that any resulting differences between the two groups should reflect the treatment 


All findings documented in this study are robust to using a probit model instead of a logit model to calculate propensity
scores. In addition, we find that the logit models result in more balanced matched samples than using the probit models.
We also estimate and match the propensity scores within each industry and year rather than using indicator variables. We
find that all results using this alternative specification are similar to the tabulated results. 

Alternatively, we estimate the propensity-score model by including all the variables in the model as specified above even 
when a proxy control variable is similar to a variable in the auditor selection model. We find that the results are robust 
to including all the redundant variables simultaneously or to including only one redundant variable at a time. 
Inferences are the same whether we match with or without replacement. A caliper is the distance between the predicted 
probabilities of choosing the treatment between matched observations (Dehejia and Wahba 2002). 
effect and not pre-existing client characteristics (Heckman et al. 1997, 1998; Dehejia and Wahba
1999, 2002). Hence, differences in means between the treatment and control groups should be
sufficient to estimate the treatment effects (Dehejia and Wahba 2002; Dehejia 2005). Nonetheless,
we also use multivariate analyses to further control for any remaining characteristic imbalances
between the two client groups as well as general cross-sectional characteristic variations. 
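
The matching step described above (one-to-one, without replacement, within a 3 percent caliper) can be sketched as follows. This is an illustrative greedy implementation under stated assumptions: the processing order of non-Big 4 clients, the function names `caliper_match` and `mean_difference`, and all scores and outcomes are hypothetical and are not taken from the study.

```python
def caliper_match(non_big4, big4, caliper=0.03):
    """Greedy nearest-neighbour matching without replacement.

    non_big4, big4: dicts mapping client id -> propensity score.
    Each non-Big 4 client is paired with the closest still-unused
    Big 4 client, provided the score distance is within the caliper.
    """
    available = dict(big4)
    pairs = []
    for nid, nscore in non_big4.items():
        if not available:
            break
        bid = min(available, key=lambda b: abs(available[b] - nscore))
        if abs(available[bid] - nscore) <= caliper:
            pairs.append((nid, bid))
            del available[bid]       # without replacement
    return pairs

def mean_difference(pairs, outcome):
    """Mean outcome of Big 4 clients minus matched non-Big 4 clients."""
    return sum(outcome[b] - outcome[n] for n, b in pairs) / len(pairs)

# Hypothetical propensity scores
non_big4 = {"n1": 0.30, "n2": 0.52, "n3": 0.90}
big4 = {"b1": 0.31, "b2": 0.50, "b3": 0.55, "b4": 0.70}
pairs = caliper_match(non_big4, big4)   # n3 has no Big 4 client within 0.03

# Hypothetical audit-quality proxy values (e.g., absolute discretionary accruals)
outcome = {"n1": 0.14, "n2": 0.12, "n3": 0.20,
           "b1": 0.13, "b2": 0.10, "b3": 0.09, "b4": 0.08}
effect = mean_difference(pairs, outcome)
```

Clients left unmatched (here `n3`) drop out of the matched sample, which is the trade-off between identification and generalizability noted in the caveats.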

We employ the Ramsey Regression Equation Specification Error Test (Ramsey 1969; hereaf- 
ter, RESET test), to identify specific client characteristics that are nonlinear to both the auditor 
choice and the audit-quality proxy. For each propensity-score regression and each audit-quality
regression, we add squared and cubed versions of all control variables, as described in Sections 
V-VII. Using an F-test, we test the null hypothesis that all of the nonlinear terms of each control 
variable equal zero. If significant nonlinearities exist in the data and the Big 4 partitioning variable 
is correlated with these nonlinear variables, then these conditions could potentially confound the 
inferences of a standard OLS regression. For the attribute-based matching, we create single-variable
matched samples in each analysis, using the propensity-score methodology controlling for
year and industry effects, based on those variables that are nonlinear to both the auditor choice and 
the audit-quality proxy. 
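
For the OLS audit-quality regressions, the joint test on the added squared and cubed terms can be written in the usual restricted-versus-unrestricted form. The notation below (RSS_r, RSS_u, q, n, k) is introduced here for illustration only and does not appear in the original study; for the logit propensity-score regressions, an analogous joint test would typically rely on a Wald or likelihood-ratio statistic.

```latex
% Restricted model:   original control variables only
% Unrestricted model: adds x_j^2 and x_j^3 for the control variable(s) being tested
F = \frac{(RSS_r - RSS_u)/q}{RSS_u/(n - k)} \;\sim\; F_{q,\,n-k}
\quad \text{under } H_0: \text{all coefficients on the added nonlinear terms equal zero},
% q = number of added terms, n = number of observations,
% k = number of parameters in the unrestricted model
```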


IV. SAMPLE SELECTION 

For our analyses, we use firm-year data from 1988 to 2006. We restrict our analyses to this
time period because reported operating cash flows as per SFAS No. 95 (FASB 1987) are only
available starting from 1988. For our discretionary accruals sample, we use data from Compustat. 
After deleting firms in the financial services industries (SIC codes between 6000—6999) and after 
imposing all the necessary requirements to calculate the discretionary accruals regression vari- 
ables, we obtain a sample of 72,600 firm-year observations, in which 59,323 and 13,277 are Big 
4 and non-Big 4 clients, respectively. Our ex ante cost-of-equity sample reflects the intersection
of Compustat, I/B/E/S Summary, and CRSP data. After imposing the necessary requirements to 
calculate the ex ante cost-of-capital regression variables, we obtain a sample of 25,068 firm-year 
observations, in which 23,856 and 1,212 reflect Big 4 and non-Big 4 clients, respectively.
Similar to the ex ante cost-of-equity sample, the forecast-accuracy sample reflects the intersection
of Compustat, I/B/E/S, and CRSP data; however, the accuracy sample requires individual analyst 
forecast data from the I/B/E/S Detail files, whereas the ex ante cost-of-equity sample requires 
consensus forecast data from the I/B/E/S Summary files. After imposing the necessary require- 
ments to calculate the forecast-accuracy regression variables, we obtain a sample of 28,037 firm- 
year observations, in which 26,521 and 1,516 reflect Big 4 and non-Big 4 clients, respectively. 


As in predicting the propensity scores, we exclude variables from the RESET analysis if the two regressions contain 
redundant variables. The comparison of the RESET findings across the three audit-quality proxies is inherently limited 
given that we adopt the audit-quality regression models for each proxy from previous studies; thus, there are limited 
control variables that are common to all three analyses. 

We delete firms with negative total assets, sales, debt, and market values of equity, as such observations introduce noise 
into the discretionary accruals regressions; moreover, raw variables from Compustat are winsorized at the 1 percent and 
99 percent levels. We find that inferences for our three analyses hold without winsorizing except for those pertaining to 
the propensity-score matched sample in the analyst forecast accuracy analysis, as significant outliers reduce the
effectiveness of the multivariate propensity-score matching.

10 For the ex ante cost-of-equity and forecast-accuracy samples, we delete firms in the financial services industries (SIC
codes between 6000-6999) and all variables are winsorized at the 1 percent and 99 percent levels. 
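
The winsorization at the 1 percent and 99 percent levels mentioned in these notes can be sketched as below. The percentile convention (nearest rank) is an assumption made for this example, since the text does not say which rule was used, and the data are hypothetical.

```python
import math

def winsorize(values, lower=0.01, upper=0.99):
    """Clamp values to the lower and upper sample percentiles.

    Percentiles use the nearest-rank rule (an assumed convention;
    statistical packages often interpolate instead).
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[max(0, math.ceil(lower * n) - 1)]
    hi = ordered[min(n - 1, math.ceil(upper * n) - 1)]
    return [min(max(v, lo), hi) for v in values]

# Hypothetical raw variable with one extreme outlier
data = list(range(1, 101)) + [10_000]
clean = winsorize(data)
```

The observation count is unchanged; only the extreme tails are pulled in, which limits the influence of outliers on the matching model.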
V. ANALYSIS 1: DISCRETIONARY ACCRUALS
Method 


We measure performance-adjusted discretionary accruals using the modified Jones model as 
recommended by Kothari et al. (2005; hereafter KLW). To test whether the Big 4 versus non-Big
4 discretionary accruals differences could be attributed to client characteristics, we use the fol- 
lowing model: 


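\[
\mathit{ADA}_{i,t} = \beta_0 + \beta_1 \mathit{BIG4}_{i,t} + \beta_2 \mathit{LOG\_MKT}_{i,t} + \beta_3 \mathit{ROA}_{i,t-1} + \beta_4 \mathit{LEV}_{i,t-1} + \beta_5 \mathit{CURR}_{i,t-1}
    + \mathit{Industry\_FE} + \mathit{Year\_FE} + \varepsilon_{i,t}, \tag{2}
\]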


where for firm i and fiscal year t: 


ADA = absolute discretionary accruals as per KLW in year t;
BIG4 = 1 if the client has a Big 4 auditor in year t, and 0 otherwise;
LOG_MKT = natural logarithm of market value of equity at the end of year t;
ROA = income before extraordinary items_{t-1} / average total assets_{t-1};
LEV = (long-term debt_{t-1} plus debt in current liabilities_{t-1}) / average total assets_{t-1}; and
CURR = current assets_{t-1} / current liabilities_{t-1}.


BIG4 is our main variable of interest and is used in a manner consistent with prior research. We 
include the variable LOG_MKT to control for client size. Following Butler et al. (2004), we
include ROA, as the KLW procedure does not eliminate all the variation in accruals due to 
firm-specific performance. Also in line with Butler et al. (2004), we include LEV and CURR to 
control for the impact of financial risk on discretionary accruals. 


Results 


Table 1 presents the descriptive statistics for both the full and propensity-score matched 
samples. There are 72,600 firm-year observations in the full sample, of which 59,323 (77.6 
percent) and 13,277 (22.4 percent) reflect Big 4 and non-Big 4 clients, respectively. The descrip- 
tive statistics for the full sample indicate that Big 4 and non-Big 4 auditors have significantly 
different clienteles. Big 4 clients are approximately ten times the size of non-Big 4 clients: the 
mean market value and total assets of Big 4 clients are $2.2 billion and $2.0 billion, respectively, 
whereas the mean market value and total assets of non-Big 4 clients are $234 million and $258
million, respectively. Moreover, Big 4 clients are more profitable and more leveraged than non-Big
4 clients, and have significantly lower discretionary accruals and current ratios than non-Big 4
clients.

Using Equation (1) to calculate the propensity scores and imposing a caliper distance of 3 
percent, we obtain a propensity-score matched sample of 21,130 firm-years, of which 10,565 are 
Big 4 clients and 10,565 are non-Big 4 clients. The propensity-score model appears effective in 
forming a balanced sample of Big 4 and non-Big 4 clients, as all accrual control variables in the 


11 The KLW model is as follows and is estimated by year and by two-digit SIC code, scaling by average lagged assets:

\[
\mathit{TOTAL\_ACCRUALS}_{i,t} = \beta_0 + \beta_1 (1/\mathit{ASSETS}_{i,t}) + \beta_2 (\Delta \mathit{SALES}_{i,t} - \Delta \mathit{REC}_{i,t}) + \beta_3 \mathit{PPE}_{i,t} + \varepsilon_{i,t},
\]

where for firm i and fiscal year t, TOTAL_ACCRUALS equals (net income before extraordinary items minus operating cash
flows from continuing operations_t)/average total assets_{t-1}; ASSETS equals average total assets_{t-1}; ΔREC equals (change in
accounts receivable from year t-1 to year t)/average total assets_{t-1}; PPE equals net property, plant, and equipment in
year t/average total assets_{t-1}; and ε equals the pre-KLW estimated discretionary accrual. From each firm's estimated
pre-KLW discretionary accrual, we subtract the estimated pre-KLW discretionary accrual of the closest ROA firm in the
same two-digit SIC code. The resulting normalized error term is the KLW discretionary accrual measure.
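
The performance-matching step in this note (subtracting the discretionary accrual of the closest-ROA firm in the same two-digit SIC code) can be sketched as follows. This is a simplified illustration: it assumes a single fiscal year, takes the pre-adjustment residuals as given, and uses hypothetical firm records and a hypothetical function name `performance_adjust`.

```python
def performance_adjust(firms):
    """Subtract the accrual residual of the closest-ROA industry peer.

    firms: list of dicts with keys 'id', 'sic2', 'roa', and 'da'
    (the pre-adjustment discretionary accrual residual).
    Firms without any industry peer are skipped.
    """
    adjusted = {}
    for f in firms:
        peers = [p for p in firms
                 if p["sic2"] == f["sic2"] and p["id"] != f["id"]]
        if not peers:
            continue
        match = min(peers, key=lambda p: abs(p["roa"] - f["roa"]))
        adjusted[f["id"]] = f["da"] - match["da"]
    return adjusted

# Hypothetical firms in one fiscal year
firms = [
    {"id": "A", "sic2": "28", "roa": 0.05, "da": 0.10},
    {"id": "B", "sic2": "28", "roa": 0.06, "da": 0.04},
    {"id": "C", "sic2": "28", "roa": -0.20, "da": -0.03},
    {"id": "D", "sic2": "35", "roa": 0.05, "da": 0.02},
]
adj = performance_adjust(firms)
ada = {k: abs(v) for k, v in adj.items()}   # absolute value, as in ADA
```

Firm D has no peer in its industry in this toy data and therefore receives no adjusted measure.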
TABLE 1


Discretionary Accruals Analysis: Descriptive Statistics
Full and Propensity-Score Matched Samples


                 Full Sample                                  Propensity-Score Matched Sample:
                                                              Matched Using the Full Model
                   All Obs.     Big 4 Non-Big 4     Difference     Big 4 Non-Big 4     Difference
                       Mean      Mean      Mean       in Means      Mean      Mean       in Means
                  Std. Dev. Std. Dev. Std. Dev.  (t-statistic) Std. Dev. Std. Dev.  (t-statistic)
ADA                  0.1031    0.0941    0.1430     -0.0488***    0.1278    0.1312     -0.0034
                     0.1398    0.1193    0.2033     (-36.76)      0.1582    0.1918      (-1.42)
Total Assets          1,698     2,021       258       1,763***       333       321       11.77
                      5,333     5,789     1,794      (34.71)       2,042     2,006       (0.42)
Total MKT Value       1,872     2,239       234       2,004***       347       289       57.51**
                      7,334     8,028     1,694      (28.63)       2,158     1,895       (2.06)
LOG_MKT              4.9784    5.4110    3.0450      2.3660***    3.5247    3.3265      0.1982***
                      2.361    2.2334    1.9013     (110.00)      1.9161    1.8949       (7.56)
ROA                 -0.0273   -0.0106   -0.1018      0.0912***   -0.0822   -0.0843      0.0021
                     0.2557    0.2206    0.3652      (37.50)      0.2990    0.3411       (0.49)
LEV                  0.2159    0.2207    0.1948      0.0261***    0.1994    0.1983      0.0011
                     0.1979    0.1967    0.2016      (13.62)      0.2074    0.1988       (0.41)
CURR                 3.1482    3.0324    3.6661     -0.6336***    3.5748    3.6350     -0.0602
                     5.5247    5.0077    7.3841     (-11.96)      5.5612    7.6348      (-0.66)
No. Obs.             72,600    59,323    13,277                   10,565    10,565
% of Total             100%     77.6%     22.4%


*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed t-tests of differences in
means.

This table presents the descriptive statistics for our full and propensity-score matched discretionary accruals samples. 
Propensity scores were calculated using Equation (1). 


Variable Definitions: 
ADA = absolute discretionary accruals as per Kothari et al. (2005) in year t;
Total Assets = total assets at the end of year t;
Total MKT Value = market value of equity at the end of year t;
LOG_MKT = natural logarithm of market value of equity at the end of year t;
ROA = income before extraordinary items_{t-1} / average total assets_{t-1};
LEV = (long-term debt_{t-1} plus debt in current liabilities_{t-1}) / average total assets_{t-1}; and
CURR = current assets_{t-1} / current liabilities_{t-1}.


propensity-score matched sample, except LOG_MKT, are insignificantly different at the 10 percent
level between the two client types. LOG_MKT is still significantly different between the two client
groups because the propensity-score model uses total assets rather than market value to proxy for
firm size. When we use market value as the proxy for firm size in the propensity-score model,
LOG_MKT is insignificantly different between the two client groups.
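
As a quick plausibility check on the balance tests, a difference-in-means t-statistic can be recomputed from the summary statistics reported in Table 1. The sketch below uses the unequal-variance form (with equal group sizes it coincides with the pooled form); the function name is hypothetical, and the inputs are the rounded ADA figures for the propensity-score matched sample.

```python
import math

def diff_in_means_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t-statistic computed from summary statistics."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# ADA, matched sample: Big 4 versus non-Big 4 (means and standard
# deviations as reported, 10,565 observations per group)
t_ada = diff_in_means_t(0.1278, 0.1582, 10565, 0.1312, 0.1918, 10565)
```

The result is roughly -1.4, in line with the tabulated t-statistic of -1.42; the small gap is presumably due to rounding of the published means and standard deviations.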

In Table 2, we confirm the results of Becker et al. (1998), Francis et al. (1999), and Butler et 
al. (2004) for the full sample, as we find a negative and significant difference in means for BIG4 
of -0.0488 (t = -36.76; p < 0.01), and a negative and significant BIG4 coefficient of -0.0179
(t = -8.48; p < 0.01) in the two Full Sample columns, respectively. All control variable coefficients
are significant (p < 0.01) and have directional effects consistent with those documented by
the previous studies. Regressions in all analyses include untabulated year and industry fixed
effects.
TABLE 2
Discretionary Accruals Analysis: Univariate and Multivariate Tests
Full and Propensity-Score Matched Samples

\[
\mathit{ADA}_{i,t} = \beta_0 + \beta_1 \mathit{BIG4}_{i,t} + \beta_2 \mathit{LOG\_MKT}_{i,t} + \beta_3 \mathit{ROA}_{i,t-1} + \beta_4 \mathit{LEV}_{i,t-1} + \beta_5 \mathit{CURR}_{i,t-1}
    + \mathit{Industry\_FE} + \mathit{Year\_FE} + \varepsilon_{i,t} \tag{2}
\]

Dependent Variable: ADA

                                  Full Sample                     Propensity-Score Matched Sample:
                                                                  Matched Using the Full Model
                        Predicted       Difference    Multivariate      Difference    Multivariate
                        Sign              in Means        Estimate        in Means        Estimate
                                     (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic)
Intercept                                                0.1128***                       0.0828***
                                                        (13.43)                          (6.34)
BIG4                        -           -0.0488***      -0.0179***      -0.0034         -0.0018
                                       (-36.76)         (-8.48)         (-1.42)         (-0.73)
LOG_MKT                     -                           -0.0059***                      -0.0054***
                                                       (-18.93)                         (-7.99)
ROA                         -                           -0.1290***                      -0.1319***
                                                       (-18.89)                        (-15.03)
LEV                         -                           -0.0497***                      -0.0570***
                                                       (-17.08)                         (-9.42)
CURR                        -                           -0.0003***                      -0.0006***
                                                        (-2.60)                         (-3.43)
Industry FE                                             Included                        Included
Year FE                                                 Included                        Included
Adjusted R²                                                0.14                            0.11
No. Obs.                                 72,600          72,600          21,130          21,130
Matching Model R²                                                          0.29
% Correctly Classified                                                   86.03%


*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed tests.

This table presents the results of our discretionary accruals tests using the full and propensity-score matched samples. 
Multivariate estimates are based on Equation (2). Propensity scores were calculated using Equation (1). t-statistics and 
p-values are calculated using clustered standard errors by firm for the multivariate analyses. For brevity, the year-specific 
and industry-specific intercepts are not reported. The matching model R² is the pseudo R² for the propensity-score logistic
regression. The percentage correctly classified refers to the percentage of audit clients that are correctly classified as Big 4 
or non-Big 4 clients, based on a 50 percent cutoff level, using the predicted probabilities from the propensity-score model. 


Variable Definitions: 
ADA = absolute discretionary accruals as per Kothari et al. (2005) in year t;
BIG4 = 1 if the client has a Big 4 auditor in year t, and 0 otherwise;
LOG_MKT = natural logarithm of market value of equity at the end of year t;
ROA = income before extraordinary items_{t-1} / average total assets_{t-1};
LEV = (long-term debt_{t-1} plus debt in current liabilities_{t-1}) / average total assets_{t-1}; and
CURR = current assets_{t-1} / current liabilities_{t-1}.
The last two columns of Table 2 present the results of the propensity-score matched sample.
We find an insignificant difference in means for BIG4 of -0.0034 (t = -1.42; p = 0.15) and an
insignificant multivariate BIG4 coefficient of -0.0018 (t = -0.73; p = 0.46), suggesting that
once client characteristics are balanced between the two clienteles, the treatment effects of Big 4
auditors are insignificantly different from those of non-Big 4 auditors with respect to discretionary
accruals. All control variable coefficients are significant (p < 0.01) and have directional effects
consistent with those documented in previous studies.

To examine whether the Big 4 effect can be attributed to specific client characteristics, we 
perform the following analysis. Using the RESET test, we find that total assets (LOG_ASSETS),
return on assets (ROA_{t-1}), total leverage (LEV_{t-1}), and the current ratio (CURR_{t-1}) are nonlinear
to both the auditor choice and discretionary accruals; hence, we create single-variable matched
samples, with year and industry indicators, using each of these foregoing variables. Table 3 reports
the discretionary accruals regressions for these matched samples, where columns one and two
report results for the LOG_ASSETS matched sample, columns three and four report the results for
the ROA_{t-1} matched sample, columns five and six report the results for the LEV_{t-1} matched
sample, and columns seven and eight report the results for the CURR_{t-1} matched sample. The
LOG_ASSETS matched sample has statistically insignificant BIG4 coefficients of -0.0030 (t =
-1.21; p = 0.23) and -0.0026 (t = -1.03; p = 0.30) for the difference in means and multivariate
results, respectively, consistent with the results using the full propensity-score matched sample.
However, the coefficients on BIG4 are negative and significant (p < 0.01) for all other
single-variable matched samples.

It is not surprising that only the LOG_ASSETS matched sample yields the same inferences as
the full propensity-score matched sample, given that client size is the characteristic most closely
related to auditor selection. For example, in Table 2, the full propensity-score matching model
explains 29 percent of the variation in auditor choice (see the matching model R²); whereas in
Table 3, the LOG_ASSETS matching model (predicting auditor choice with only total assets, year,
and industry) explains 28 percent of the variation in auditor choice. The other single-variable 
matched models at best only explain 6 percent of the variation in auditor choice. Taken together, 
the findings from our matched samples suggest that the Big 4 effect, using discretionary accruals 
as an audit-quality proxy, appears to be attributable to client characteristics, primarily client size. 


VI. ANALYSIS 2: EX ANTE COST-OF-EQUITY CAPITAL 
Method 


Following Khurana and Raman (2004; hereafter KR), we examine the relation between au- 
ditor type and the ex ante cost of capital, as follows: 


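\[
\mathit{RPEG}_{i,t} = \beta_0 + \beta_1 \mathit{BIG4}_{i,t} + \beta_2 \mathit{BETA}_{i,t} + \beta_3 \mathit{LOG\_LEV}_{i,t} + \beta_4 \mathit{VAR}_{i,t} + \beta_5 \mathit{LOG\_MKT}_{i,t}
    + \beta_6 \mathit{LOG\_BTM}_{i,t} + \beta_7 \mathit{GRWTH}_{i,t} + \mathit{Industry\_FE} + \mathit{Year\_FE} + \varepsilon_{i,t}, \tag{3}
\]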


12 Hribar and Nichols (2007) document that measurement error in absolute discretionary accruals is often correlated with 
the partitioning variable in tests of earnings management. They specifically identify that the BIG4 partitioning variable
is highly correlated with the following two main drivers of measurement error in absolute discretionary accruals: firm 
size and operating volatility (Hribar and Nichols 2007, 1033). Following their guidance to correct for measurement error 
in estimating the partitioning variable, we simultaneously include several firm-size (the logarithm of assets, sales, and 
market value) and operating-volatility (cash-flow and sales volatility) controls in our discretionary accruals regressions 
(untabulated), and find similar results in the full and matched samples to those documented in Tables 2 and 3. Moreover, 
we ran our analyses separately for positive and negative discretionary accruals and again, all inferences are the same as 
those documented in our main analyses. 
where for firm i and fiscal year t:


RPEG = ex ante cost-of-equity capital estimated using the r_PEG approach as per Easton
(2004) and used by KR;
BIG4 = 1 if the client has a Big 4 auditor in year t, and 0 otherwise;
BETA = stock beta (systematic risk) calculated over the 36 months ending in the month
of the fiscal year-end in year t;
LOG_LEV = natural logarithm of (long-term debt_t plus debt in current liabilities_t)/average total
assets_{t-1};
VAR = earnings variability measured by the dispersion in analysts' earnings forecasts
available on I/B/E/S during the fiscal year-end month;
LOG_MKT = natural logarithm of market value of equity at the end of year t;
LOG_BTM = natural logarithm of book value of equity/market value of equity at the end of
year t; and
GRWTH = forecasted growth measured as the difference between the mean analysts'
two-year- and one-year-ahead earnings forecasts scaled by the one-year-ahead
earnings forecast.


For comparative purposes, all variables and their predicted directions are as specified by KR. 
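
The text does not spell out the r_PEG formula. The sketch below assumes the commonly cited form from Easton (2004), in which the implied cost of equity is the square root of the forecasted change in earnings per share divided by the current price, and computes GRWTH as defined above. The function names and all input values are hypothetical.

```python
import math

def r_peg(eps1, eps2, price):
    """Implied cost of equity, assumed form: sqrt((eps2 - eps1) / price).

    eps1, eps2: one- and two-year-ahead earnings-per-share forecasts.
    Requires a positive price and eps2 >= eps1.
    """
    if price <= 0 or eps2 < eps1:
        raise ValueError("r_PEG is undefined for these inputs")
    return math.sqrt((eps2 - eps1) / price)

def grwth(eps1, eps2):
    """Forecasted growth: (eps2 - eps1) scaled by eps1."""
    return (eps2 - eps1) / eps1

# Hypothetical consensus forecasts and share price
r = r_peg(eps1=2.00, eps2=2.30, price=25.0)
g = grwth(eps1=2.00, eps2=2.30)
```

With these illustrative inputs the implied cost of equity is about 11 percent, which is of the same order as the sample mean RPEG reported in Table 4.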


Results 


Table 4 presents descriptive statistics for both the full and propensity-score matched ex ante 
cost-of-equity capital, RPEG, samples. Due to restrictions concerning analyst forecast data, there 
are 25,068 firm-year observations in the full sample, of which 23,856 (94.9 percent) and 1,212 
(5.1 percent) represent Big 4 and non-Big 4 clients, respectively. On average, firms in this full 
sample are significantly larger across both auditor types than in the accruals full sample: the mean 
market value and total assets of Big 4 clients are $4.5 billion and $4.2 billion, respectively, and the 
mean market value and total assets of non-Big 4 clients are $1.5 billion and $1.9 billion, respec- 
tively, indicating that the size differential is smaller in this sample than in the accruals sample. 
Also worth noting in the RPEG full sample is that Big 4 clients, on average, have significantly 
lower RPEG, BETA, and LOG_BTM values than non-Big 4 clients. The propensity-score model
once again appears effective in forming a balanced sample of Big 4 and non-Big 4 clients as all the 
RPEG control variables in the propensity-score matched sample are insignificantly different at the 
10 percent level between the two client types. 

In Table 5, we first confirm the results of KR for the full sample, as we find a negative and 
significant difference in means for BIG4 of -0.0072 (t = -4.99; p < 0.01), and a negative and
significant BIG4 coefficient of -0.0037 (t = -2.14; p = 0.03) in the two Full Sample columns,
respectively. All control variable coefficients are significant (p < 0.01) and have directional effects
consistent with those documented by KR for their U.S. sample. The last two columns of Table 5
present the results of the propensity-score matched sample. We find an insignificant difference in
means for BIG4 of -0.0031 (t = -1.43; p = 0.15) and an insignificant BIG4 coefficient of
-0.0021 (t = -1.01; p = 0.31). All the control variable coefficients remain significant and have
directional effects consistent with those reported by KR. 

Again, to investigate whether the Big 4 effect could reflect client characteristics, we perform 
the following procedures. Using the RESET test, we find that total assets (LOG_ASSETS), the
stock beta (BETA), total leverage (LOG_LEV), and the book-to-market ratio (LOG_BTM) are
nonlinear to both the auditor choice and to the ex ante cost-of-equity capital. Hence, we create 
single-variable matched samples, with year and industry indicators, using each of these variables. 
Table 6 reports the RPEG regressions using these attribute-based matched samples, where col- 
TABLE 4


Ex Ante Cost-of-Equity Capital Analysis: Descriptive Statistics
Full and Propensity-Score Matched Samples


                 Full Sample                                  Propensity-Score Matched Sample:
                                                              Matched Using the Full Model
                   All Obs.     Big 4 Non-Big 4     Difference     Big 4 Non-Big 4     Difference
                       Mean      Mean      Mean       in Means      Mean      Mean       in Means
                  Std. Dev. Std. Dev. Std. Dev.  (t-statistic) Std. Dev. Std. Dev.  (t-statistic)
RPEG                 0.1146    0.1142    0.1215     -0.0072***    0.1182    0.1214     -0.0031
                     0.0492    0.0489    0.0537      (-4.99)      0.0530    0.0536      (-1.43)
Total Assets          4,093     4,202     1,944       2,258***     1,723     1,955     -231.51
                     12,122    12,258     8,784       (6.33)       7,802     8,812      (-0.68)
Total MKT Value       4,352     4,499     1,466       3,083        1,516     1,474       42.17
                     12,342    12,575     5,428       (8.36)       5,906     5,444       (0.18)
LOG_MKT              6.7469    6.7938    5.8235      0.9702***    5.8871    5.8257      0.0614
                     1.7851    1.7837    1.5471      (18.59)      1.5574    1.5518       (0.97)
BETA                 1.1029    1.0959    1.2403     -0.1443***    1.0983    1.2441     -0.0461
                     0.7566    0.7461    0.9292      (-6.48)      0.8072    0.9308      (-1.28)
LOG_LEV             -1.8453   -1.8283   -2.1791      0.3510***   -2.2303   -2.1860     -0.0442
                     1.3257    1.3060    1.6310       (9.01)      1.7432    1.6341      (-0.64)
VAR                  0.0687    0.0689    0.0658      0.0030       0.0640    0.0658     -0.0018
                     0.1033    0.1037    0.0961       (1.01)      0.0101    0.0960      (-0.45)
LOG_BTM             -0.8204   -0.8331   -0.7565     -0.0766***   -0.7405   -0.7591      0.0186
                     0.6862    0.6883    0.6374      (-3.79)      0.6928    0.6383       (0.69)
GRWTH                0.3314    0.3306    0.3476     -0.0171       0.4150    0.3479      0.0671
                     1.5853    1.6051    1.1257      (-0.37)      2.0604    1.1289       (0.99)
No. Obs.             25,068    23,856     1,212                    1,204     1,204
% of Total             100%     94.9%      5.1%


ж жж ЖЖЖ Indicate significance at the 0.10, 0.05 and 0.01 levels, respectively, using two-tailed t-tests of differences in 
means. 


This table presents the descriptive statistics for our full and propensity-score matched ex ante cost-of-equity capital 
samples. Propensity scores were calculated using Equation (1). 


Variable Definitions: 
КРЕС = ex ante cost-of-equity capital estimated using the г... approach as per Easton (2004) and used by 
Khurana and Raman (2004); 
Total Assets = total assets at the end of year; 
Total MKT Value = market value of equity at the end of year,; 
LOG_MKT = natural logarithm of market value of equity at the end of year; 
BETA = stock beta (systematic risk) calculated over the 36 months ending in the month of the fiscal 
year-end,; 
LOG, LEV = natural logarithm of (long-term debt, plus debt in current liabilities,)/average total assets,_; 
VAR = earnings variability measured by the dispersion in analysts’ earnings forecasts available on I/B/E/S 
during the fiscal year-end month; 
LOG_BTM = natural logarithm of book value of equity/market value of equity at the end of year,; and 
GRWTH = forecasted growth measured as the difference between the mean analysts’ two-year- and 
one-year-ahead earnings forecasts scaled by the one-year-ahead earnings forecast. 
TABLE 5 

Ex Ante Cost-of-Equity Capital Analysis: Univariate and Multivariate Tests 
Full and Propensity-Score Matched Samples 

$$
\begin{aligned}
RPEG_{it} = {} & \beta_0 + \beta_1 BIG4_{it} + \beta_2 BETA_{it} + \beta_3 LOG\_LEV_{it} + \beta_4 VAR_{it} + \beta_5 LOG\_MKT_{it} \\
& + \beta_6 LOG\_BTM_{it} + \beta_7 GRWTH_{it} + Industry\_FE + Year\_FE + \varepsilon_{it} \qquad (3)
\end{aligned}
$$

Dependent Variable: RPEG 

                                   Full Sample                     Propensity-Score Matched Sample: 
                                                                   Matched Using the Full Model 
                        Predicted  Difference      Multivariate    Difference      Multivariate 
                        Sign       in Means        Estimate        in Means        Estimate 
                                   (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic) 
Intercept                                          0.1558***                       0.1517*** 
                                                   (22.09)                         (6.15) 
BIG4                    -          -0.0072***      -0.0037**       -0.0031         -0.0021 
                                   (-4.99)         (-2.14)         (-1.43)         (-1.01) 
BETA                    +                          0.0099***                       0.0091*** 
                                                   (16.51)                         (5.29) 
LOG_LEV                 +                          0.0029***                       0.0028*** 
                                                   (10.52)                         (4.40) 
VAR                     +                          0.0232***                       0.0313** 
                                                   (5.38)                          (2.14) 
LOG_MKT                 -                          -0.0062***                      -0.0093*** 
                                                   (-19.77)                        (-8.96) 
LOG_BTM                 +                          0.0105***                       0.0048** 
                                                   (13.71)                         (1.99) 
GRWTH                   +                          0.0067***                       0.0079*** 
                                                   (4.63)                          (5.04) 
Industry_FE                                        Included                        Included 
Year_FE                                            Included                        Included 
Adjusted R²                                        0.28                            0.33 
No. Obs.                           25,068          25,068          2,408           2,408 
Matching Model R²                                                  0.11 
% Correctly Classified                                             95.00% 

*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed tests. 

This table presents the results of our ex ante cost-of-equity capital tests using the full and propensity-score matched 
samples. Multivariate estimates are based on Equation (3). Propensity scores were calculated using Equation (1). t-statistics 
and p-values are calculated using clustered standard errors by firm for the multivariate analyses. For brevity, the year- 
specific and industry-specific intercepts are not reported. The matching model R² is the pseudo R² for the propensity-score 
logistic regression. The percentage correctly classified refers to the percentage of audit clients that are correctly classified 
as Big 4 or non-Big 4 clients, based on a 50 percent cutoff level, using the predicted probabilities from the propensity-score 
model. 

Variable Definitions: 
RPEG = ex ante cost-of-equity capital estimated using the r_PEG approach as per Easton (2004) and used by 
    Khurana and Raman (2004); 
BIG4 = 1 if the client has a Big 4 auditor in year_t, and 0 otherwise; 
BETA = stock beta (systematic risk) calculated over the 36 months ending in the month of the fiscal year-end_t; 
LOG_LEV = natural logarithm of (long-term debt_t plus debt in current liabilities_t)/average total assets; 
VAR = earnings variability measured by the dispersion in analysts' earnings forecasts available on I/B/E/S 
    during the fiscal year-end month; 
LOG_MKT = natural logarithm of market value of equity at the end of year_t; 
LOG_BTM = natural logarithm of book value of equity/market value of equity at the end of year_t; and 
GRWTH = forecasted growth measured as the difference between the mean analysts' two-year- and one-year-ahead 
    earnings forecasts scaled by the one-year-ahead earnings forecast. 


umns one and two report results for the LOG. ASSETS matched sample, columns three and four 
report the results for the BETA matched sample, columns five and six report the results for the 
LOG_LEV matched sample, and columns seven and eight report the results for the LOG_BTM 
matched sample. 

As in the accruals analysis, the difference in means and the coefficient on BIG4 for the 
multivariate results in the LOG_ASSETS matched sample are insignificant: -0.0032 (t = -1.49; 
p = 0.14) and -0.0013 (t = -0.61; p > 0.50), respectively. The differences in means for BIG4 
are negative and significant, and the coefficients on BIG4 in the multivariate results are insignifi- 
cant for all other single-variable matched samples. As in the discretionary accruals analysis, the 
LOG_ASSETS matched sample yields the same inferences as the full propensity-score matched 
sample. Together, these findings suggest that the Big 4 effect on the ex ante cost-of-equity capital 
is likely attributable to client characteristics, primarily client size. 


VII. ANALYSIS 3: ANALYST FORECAST ACCURACY 
Method 


To test whether the Big 4 versus non-Big 4 differences in analyst forecast accuracy can be 
attributed to client characteristics, we use the following model employed by Behn et al. (2008; 
hereafter BCK): 


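$$
\begin{aligned}
ACCY_{it} = {} & \beta_0 + \beta_1 BIG4_{it} + \beta_2 LOG\_MKT_{it} + \beta_3 SURPRISE_{it} + \beta_4 NETLOSS_{it} + \beta_5 ZMIJ_{it} \\
& + \beta_6 HORIZON_{it} + \beta_7 STDROE_{it} + \beta_8 NANA_{it} + \beta_9 EL_{it} + Industry\_FE + Year\_FE + \varepsilon_{it} \qquad (4)
\end{aligned}
$$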


where for firm i and fiscal year t: 


ACCY = negative of the absolute value of the difference between analysts' earnings 
    forecasts of eps_{t+1} and actual eps_{t+1}, scaled by stock price at the end of year_t, as 
    per Lang and Lundholm (1996) and used by BCK; 
BIG4 = 1 if the client has a Big 4 auditor in year_t, and 0 otherwise; 
LOG_MKT = natural logarithm of market value of equity at the end of year_{t+1}; 
SURPRISE = (net income_{t+1} - net income_t)/market value of equity at the end of year_t; 
NETLOSS = 1 if the client has negative net income_{t+1}, and 0 otherwise; 
ZMIJ = distress score calculated using Zmijewski's (1984) unweighted original parameters; 
HORIZON = natural logarithm of the average number of calendar days between the forecast 
    announcement date and the subsequent earnings announcement date; 
STDROE = standard deviation of net income over the five years from year_{t-4} until year_t; 
NANA = natural logarithm of the number of analysts following the client; and 
EL = actual eps_{t+1}. 


For comparative purposes, all variables and their predicted directions are as specified by BCK. 
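
To make the dependent variable concrete, the following minimal sketch computes ACCY from its definition (the negative of the absolute forecast error, scaled by stock price). The function name and the three firm-years are hypothetical and serve only to illustrate the arithmetic.

```python
def forecast_accuracy(forecast_eps, actual_eps, price):
    """ACCY = -|forecast - actual| / price; values closer to zero mean more accurate forecasts."""
    if price <= 0:
        raise ValueError("price must be positive")
    return -abs(forecast_eps - actual_eps) / price

# Hypothetical firm-years: (analysts' EPS forecast, actual EPS, year-end stock price).
examples = [(2.10, 2.00, 40.0), (0.50, 0.80, 12.0), (1.25, 1.25, 30.0)]
accy = [forecast_accuracy(f, a, p) for f, a, p in examples]
for (f, a, p), value in zip(examples, accy):
    print(f"forecast={f:.2f} actual={a:.2f} price={p:.2f} ACCY={value:.4f}")
```

Because ACCY is never positive, a positive BIG4 coefficient in Equation (4) corresponds to smaller absolute forecast errors for Big 4 clients.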


Results 


Table 7 presents descriptive statistics for both the full and propensity-score matched analyst 
forecast accuracy, ACCY, samples. The full sample has a total of 28,037 firm-year observations, of 
which 26,521 (94.3 percent) and 1,516 (5.7 percent) represent Big 4 and non-Big 4 clients, 
respectively. As with the ex ante cost-of-equity capital analysis, the firms in this full sample are 
significantly larger across both client types than in the accruals full sample and, hence, the size 
differential between the two groups is again smaller in this sample than it is in the accruals 
analysis. Overall, the descriptive statistics again confirm that Big 4 and non-Big 4 clienteles are 
significantly different in terms of size and profitability. Moreover, relevant to this analysis, we find 
that Big 4 audit clients have a larger analyst following and a lower probability of bankruptcy than 
non-Big 4 audit clients. 

In Table 8, we first confirm the results of BCK for the full sample, as we find a positive and 
significant difference in means for BIG4 of 0.0083 (t = 7.21; p < 0.01), and a positive and 
significant BIG4 coefficient of 0.0042 (t = 2.34; p = 0.02) in the two Full Sample columns, 
respectively. Except for HORIZON, NANA, and EL, all control variable coefficients have direc- 
tional effects and significance levels consistent with those documented by BCK. The last two 
columns of Table 8 present the results of the propensity-score matched sample. We find an 
insignificant difference in means for BIG4 of 0.0026 (t = 1.30; p = 0.19) and an insignificant 
BIG4 coefficient of 0.0031 (t = 1.54; p = 0.12). 

Using the RESET test, we find that the total assets (LOG_ASSETS), the standard deviation of 
the return on equity (STDROE), and analyst following (NANA) are nonlinear to both the auditor 
choice and to analyst forecast accuracy. Hence, we create single-variable matched samples, with 
year and industry indicators, using each of these variables. Table 9 reports the ACCY regressions 
for the attribute-based matched samples, where columns one and two report results for the 
LOG_ASSETS matched sample, columns three and four report the results for the STDROE matched 
sample, and columns five and six report the results for the NANA matched sample. 

As in the accruals and ex ante cost-of-equity capital analyses, the difference in means and the 
coefficients on BIG4 for the multivariate results in the LOG_ASSETS matched sample are insig- 
nificant: -0.0021 (t = -0.91; p = 0.36) and 0.0013 (t = 0.59; p > 0.50), respectively. For the 
STDROE matched sample, the difference in means for BIG4 is positive and significant, while the 
BIG4 coefficient in the multivariate analyses is insignificant. Moreover, the difference in means 
and the coefficients on BIG4 in the NANA matched sample are positive and significant. In line with 
the other two analyses, these findings suggest that client characteristics, and particularly client 
size, could potentially confound the inferences pertaining to the Big 4 effect. 

Our findings concerning analyst forecast accuracy appear to be consistent with a 2008 CFA 
Institute Center survey of 617 CFA investment analysts documenting that the majority of analysts 
do not prefer Big 4 auditors to non-Big 4 auditors. Specifically, only 41 percent of the respondents 
indicated that they generally had a preference for firms using "brand-name" auditors; moreover, 
only 15 percent of the respondents thought that a smaller company becomes less attractive as an 
investment when it switches to a lower-cost auditor that may be more efficient 
and cost-effective (CFA Institute Center 2008). 


VIII. SENSITIVITY ANALYSES 
Bootstrapping, Kernel Weighting, and Random Subsamples 


To mitigate concerns that our findings are a consequence of the smaller sample sizes, we 


The Accounting Review January 2011 
American Accounting Association 


Big 4 versus Non-Big 4 Differences in Audit-Quality Proxies 277 


TABLE 7 

Analyst Forecast Accuracy Analysis: Descriptive Statistics 
Full and Propensity-Score Matched Samples 

                  Full Sample                                         Propensity-Score Matched Sample: 
                                                                      Matched Using the Full Model 
                  All Obs.    Big 4       Non-Big 4   Difference      Big 4       Non-Big 4   Difference 
                  Mean        Mean        Mean        in Means        Mean        Mean        in Means 
                  Std. Dev.   Std. Dev.   Std. Dev.   (t-statistic)   Std. Dev.   Std. Dev.   (t-statistic) 
ACCY              -0.0122     -0.0117     -0.0201     0.0082***       -0.0162     -0.0189     0.0027 
                  0.0437      0.0425      0.0608      (7.21)          0.0511      0.0593      (1.30) 
Total Assets      4,243       4,377       1,893       2,484***        1,382       1,929       -547** 
                  12,787      12,990      8,141       (7.36)          5,527       8,245       (-2.12) 
Total MKT Value   4,875       5,058       1,677       3,281***        1,238       1,708       -470** 
                  17,331      17,29       6,756       (7.39)          4,367       6,839       (-2.22) 
LOG_MKT           6.6934      6.7582      5.5594      1.1989***       5.6056      5.5848      0.0208 
                  1.8677      1.8508      1.7975      (24.57)         1.6483      1.7950      (0.33) 
SURPRISE          0.0814      0.0812      0.0838      -0.0026         0.0865      0.0821      0.0045 
                  0.1917      0.1909      0.2062      (-0.50)         0.2198      0.2040      (0.58) 
NETLOSS           0.2036      0.2029      0.2170      -0.0141*        0.2203      0.2108      0.0095 
                  0.4027      0.4021      0.4123      (-1.33)         0.4146      0.4080      (0.63) 
ZMIJ              -3.0792     -3.0710     -3.2154     0.1441***       -3.1456     -3.2279     0.0822 
                  1.3155      1.3085      1.4266      (4.15)          1.4487      1.4024      (1.57) 
HORIZON           3.3829      3.3833      3.3765      0.0068          3.3892      3.3766      0.0126 
                  0.5051      0.5029      0.5426      (0.51)          0.5385      0.5372      (0.74) 
STDROE            0.3623      0.3486      0.6005      -0.2539***      0.5285      0.5580      -0.0295 
                  1.1282      1.0816      1.7360      (-8.53)         1.4723      1.6361      (-0.52) 
NANA              0.9265      0.9457      0.5905      0.3552***       0.5940      0.5985      -0.0045 
                  0.8588      0.8626      0.7106      (15.73)         0.6943      0.7137      (-0.17) 
EL                0.9682      0.9798      0.7661      0.2137***       0.7489      0.7862      -0.0372 
                  2.0088      2.0342      1.4804      (4.03)          1.5335      1.4806      (-0.66) 
No. Obs.          28,037      26,521      1,516                       1,475       1,475 
% of Total        100%        94.3%       5.7% 

*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed t-tests of differences in 
means. 

This table presents the descriptive statistics for our full and propensity-score matched analyst forecast-accuracy samples. 
Propensity scores were calculated using Equation (1). 

Variable Definitions: 
ACCY = negative of the absolute value of the difference between analysts' earnings forecasts of eps_{t+1} and 
    actual eps_{t+1}, scaled by stock price at the end of year_t, as per Lang and Lundholm (1996) and 
    used by Behn et al. (2008); 
Total Assets = total assets at the end of year_{t+1}; 
Total MKT Value = market value of equity at the end of year_{t+1}; 
LOG_MKT = natural logarithm of market value of equity at the end of year_{t+1}; 
SURPRISE = (net income_{t+1} - net income_t)/market value of equity at the end of year_t; 
NETLOSS = 1 if the client has negative net income_{t+1}, and 0 otherwise; 
ZMIJ = distress score calculated using Zmijewski's (1984) unweighted original parameters; 
HORIZON = natural logarithm of the average number of calendar days between the forecast announcement date 
    and the subsequent earnings announcement date; 
STDROE = standard deviation of net income over the five years from year_{t-4} until year_t; 
EL = actual eps_{t+1}; and 
NANA = natural logarithm of the number of analysts following the client. 
TABLE 8 

Analyst Forecast Accuracy Analysis: Univariate and Multivariate Tests 
Full and Propensity-Score Matched Samples 

$$
\begin{aligned}
ACCY_{it} = {} & \beta_0 + \beta_1 BIG4_{it} + \beta_2 LOG\_MKT_{it} + \beta_3 SURPRISE_{it} + \beta_4 NETLOSS_{it} + \beta_5 ZMIJ_{it} \\
& + \beta_6 HORIZON_{it} + \beta_7 STDROE_{it} + \beta_8 NANA_{it} + \beta_9 EL_{it} + Industry\_FE + Year\_FE + \varepsilon_{it} \qquad (4)
\end{aligned}
$$

Dependent Variable: ACCY 

                                   Full Sample                     Propensity-Score Matched Sample: 
                                                                   Matched Using the Full Model 
                        Predicted  Difference      Multivariate    Difference      Multivariate 
                        Sign       in Means        Estimate        in Means        Estimate 
                                   (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic) 
Intercept                                          -0.0414***                      -0.0302*** 
                                                   (-9.08)                         (-2.59) 
BIG4                    +          0.0083***       0.0042**        0.0026          0.0031 
                                   (7.21)          (2.34)          (1.30)          (1.54) 
LOG_MKT                 +                          0.0027***                       0.0052*** 
                                                   (10.12)                         (4.36) 
SURPRISE                -                          -0.0598***                      -0.0439*** 
                                                   (-13.43)                        (-4.46) 
NETLOSS                 -                          -0.0072***                      -0.0198*** 
                                                   (-6.87)                         (-4.68) 
ZMIJ                    -                          -0.0028***                      -0.0024* 
                                                   (-6.69)                         (-1.79) 
HORIZON                 -                          0.0001                          -0.0003 
                                                   (0.22)                          (-0.17) 
STDROE                  -                          -0.0021***                      -0.0035*** 
                                                   (-4.07)                         (-2.66) 
NANA                    +                          0.0016***                       -0.0006 
                                                   (4.25)                          (-0.27) 
EL                      ?                          0.0006***                       -0.0006 
                                                   (2.38)                          (-0.62) 
Industry_FE                                        Included                        Included 
Year_FE                                            Included                        Included 
Adjusted R²                                        0.19                            0.18 
No. Obs.                           28,037          28,037          2,950           2,950 
Matching Model R²                                                  0.14 
% Correctly Classified                                             94.62% 

*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed tests. 

This table presents the results of our analyst forecast accuracy tests using the full and propensity-score matched samples. 
Multivariate estimates are based on Equation (4). Propensity scores were calculated using Equation (1). 

Following Behn et al. (2008), t-statistics and p-values are calculated using White's (1980) heteroscedasticity-corrected 
standard errors clustered by firm for the multivariate analyses. For brevity, the year-specific and industry-specific intercepts 
are not reported. The matching model R² is the pseudo R² for the propensity-score logistic regression. The percentage 
correctly classified refers to the percentage of audit clients that are correctly classified as Big 4 or non-Big 4 clients, based 
on a 50 percent cutoff level, using the predicted probabilities from the propensity-score model. 

Variable Definitions: 
ACCY = negative of the absolute value of the difference between analysts' earnings forecasts of eps_{t+1} and 
    actual eps_{t+1}, scaled by stock price at the end of year_t, as per Lang and Lundholm (1996) and 
    used by Behn et al. (2008); 
BIG4 = 1 if the client has a Big 4 auditor in year_t, and 0 otherwise; 
LOG_MKT = natural logarithm of market value of equity at the end of year_{t+1}; 
SURPRISE = (net income_{t+1} - net income_t)/market value of equity at the end of year_t; 
NETLOSS = 1 if the client has negative net income_{t+1}, and 0 otherwise; 
ZMIJ = distress score calculated using Zmijewski's (1984) unweighted original parameters; 
HORIZON = natural logarithm of the average number of calendar days between the forecast announcement date 
    and the subsequent earnings announcement date; 
STDROE = standard deviation of net income over the five years from year_{t-4} until year_t; 
EL = actual eps_{t+1}; and 
NANA = natural logarithm of the number of analysts following the client. 


perform the following analyses. First, for each matched sample in our main analyses, we obtain 
bootstrap estimates, using 10,000 repetitions. Second, using each audit-quality proxy, we run 
analyses for the full sample using kernel weighting (Heckman et al. 1997), which puts more 
weight on the observations that are closer in the propensity score or client size. We find that all 
inferences from these tests are the same as those documented in our matched sample analyses. 
Moreover, to ensure that the Big 4 effect can still be detected in smaller subsamples, we 
perform the following analyses. First, for each audit-quality proxy, we obtain bootstrap estimates 
using 10,000 repetitions and sample sizes equal to those of the matched sample in each repetition. 
Second, for each audit-quality proxy, we initially select a random subsample with the same 
number of observations and composition (i.e., 50 percent Big 4 clients and 50 percent non-Big 4 
clients) as in the matched sample. Using this random subsample, we obtain bootstrap estimates 
using 10,000 repetitions.13 We find that the Big 4 effect holds in these alternative analyses. 
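
The resampling idea behind these checks can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it bootstraps a difference in means rather than a regression coefficient, the data are simulated, and the function name, seeds, and the percentile interval are assumptions made for the example.

```python
import random
import statistics

def bootstrap_mean_difference(big4, non_big4, repetitions=10_000, seed=0):
    """Resample each group with replacement and collect the difference in means."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repetitions):
        resampled_big4 = rng.choices(big4, k=len(big4))
        resampled_non_big4 = rng.choices(non_big4, k=len(non_big4))
        estimates.append(statistics.fmean(resampled_big4) - statistics.fmean(resampled_non_big4))
    estimates.sort()
    # Percentile interval covering the central 95 percent of bootstrap estimates.
    lower = estimates[int(0.025 * repetitions)]
    upper = estimates[int(0.975 * repetitions) - 1]
    return statistics.fmean(estimates), (lower, upper)

# Simulated (hypothetical) outcome values for a 50/50 subsample of Big 4 and non-Big 4 clients.
data_rng = random.Random(42)
big4_values = [data_rng.gauss(0.114, 0.05) for _ in range(200)]
non_big4_values = [data_rng.gauss(0.121, 0.05) for _ in range(200)]

estimate, (ci_low, ci_high) = bootstrap_mean_difference(big4_values, non_big4_values, repetitions=1000)
print(f"bootstrap mean difference: {estimate:.4f}, 95% interval: [{ci_low:.4f}, {ci_high:.4f}]")
```

An interval that excludes zero would indicate that the group difference remains detectable at the smaller sample size.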


Varying the Proportion of Big 4 and Non-Big 4 Clients in the Full and Matched Samples 


Some of our audit-quality proxies result in samples that have different auditor compositions 
than the full population, potentially biasing our inferences. Hence, for each audit-quality proxy, we 
select a random sample with an 80/20 ratio between Big 4 and non-Big 4 auditors, as in the full 
Compustat population, from both our full and matched samples, and obtain bootstrap estimates 
using 10,000 repetitions.14 All reported inferences are robust to these alternative specifications. 
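
Drawing a subsample with a fixed auditor composition can be sketched as below. The helper name, the seed, and the use of integer identifiers in place of firm-year observations are hypothetical; the group sizes mirror a matched sample with 10,565 clients of each auditor type and an 8,000/2,000 draw.

```python
import random

def draw_with_ratio(big4_obs, non_big4_obs, total, big4_share=0.8, seed=0):
    """Draw `total` observations without replacement so that `big4_share` of them are Big 4 clients."""
    rng = random.Random(seed)
    n_big4 = round(total * big4_share)
    n_non_big4 = total - n_big4
    if n_big4 > len(big4_obs) or n_non_big4 > len(non_big4_obs):
        raise ValueError("not enough observations to reach the requested composition")
    return rng.sample(big4_obs, n_big4), rng.sample(non_big4_obs, n_non_big4)

# Hypothetical identifiers standing in for matched-sample firm-years of each auditor type.
matched_big4 = list(range(10_565))
matched_non_big4 = list(range(10_565))

sub_big4, sub_non_big4 = draw_with_ratio(matched_big4, matched_non_big4, total=10_000)
print(len(sub_big4), len(sub_non_big4))
```

The regression of interest would then be re-estimated on the combined subsample within each bootstrap repetition.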


Additional Client and Auditor Characteristics 


Facing a similar problem of separating the effects of client from those of audit firm charac- 
teristics, Francis and Yu (2009; hereafter FY) document that the audit quality of larger Big 4 


13 For example, the discretionary accruals propensity-score matched sample has 10,565 clients of each auditor type; hence, 
we randomly select 10,565 of each auditor type from our full sample and estimate our regression Equation (2) for each 
random sample, repeating this procedure 10,000 times to obtain bootstrapped estimates. 

14 For example, from the discretionary accruals full sample, we randomly draw 40,000 observations from Big 4 clients and 
10,000 observations from non-Big 4 clients, achieving an 80/20 ratio. Similarly, from the discretionary accruals matched 
sample, we randomly draw 8,000 observations from Big 4 clients and 2,000 observations from non-Big 4 clients, 
achieving an 80/20 ratio. Using these two subsamples, one derived from the full sample and the other from the matched 
sample, we re-estimate our regression models with 10,000 bootstrap repetitions. 
TABLE 9 

Analyst Forecast Accuracy Analysis: Univariate and Multivariate Tests 
Attribute-Based Matched Samples 

$$
\begin{aligned}
ACCY_{it} = {} & \beta_0 + \beta_1 BIG4_{it} + \beta_2 LOG\_MKT_{it} + \beta_3 SURPRISE_{it} + \beta_4 NETLOSS_{it} + \beta_5 ZMIJ_{it} \\
& + \beta_6 HORIZON_{it} + \beta_7 STDROE_{it} + \beta_8 NANA_{it} + \beta_9 EL_{it} + Industry\_FE + Year\_FE + \varepsilon_{it} \qquad (4)
\end{aligned}
$$

Dependent Variable: ACCY 

                        Size Matched Sample:            STDROE Matched Sample:          NANA Matched Sample: 
                        Matched Using LOG_ASSETS,       Matched Using STDROE,           Matched Using NANA, 
                        Year and Industry               Year and Industry               Year and Industry 
                        Difference      Multivariate    Difference      Multivariate    Difference      Multivariate 
                        in Means        Estimate        in Means        Estimate        in Means        Estimate 
                        (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic)   (t-statistic) 
BIG4                    -0.0021         0.0013          0.0078***       0.0024          0.0105***       0.0038*** 
                        (-0.91)         (0.59)          (4.08)          (1.19)          (6.13)          (2.10) 
Adjusted R²                             0.21                            0.22                            0.20 
No. Obs.                2,956           2,956           2,982           2,982           2,960           2,960 
Matching Model R²       0.13                            0.05                            0.06 
% Correctly Classified  94.60%                          94.55%                          94.56% 

*, **, *** Indicate significance at the 0.10, 0.05, and 0.01 levels, respectively, using two-tailed tests. 

This table presents the results of our analyst forecast accuracy tests using three different attribute-based matched samples. 
We use the propensity-score methodology to implement the attribute-based matching. Propensity scores were calculated 
using only a single variable with year and industry indicators. 

Multivariate estimates are based on Equation (4). Following Behn et al. (2008), t-statistics and p-values are calculated 
using White's (1980) heteroscedasticity-corrected standard errors clustered by firm for the multivariate analyses. For 
brevity, we only report the coefficients on BIG4. The matching model R² is the pseudo R² for the propensity-score logistic 
regression. The percentage correctly classified refers to the percentage of audit clients that are correctly classified as Big 4 
or non-Big 4 clients, based on a 50 percent cutoff level, using the predicted probabilities from the propensity-score model. 

Variable Definitions: 
ACCY = negative of the absolute value of the difference between analysts' earnings forecasts of eps_{t+1} and 
    actual eps_{t+1}, scaled by stock price at the end of year_t, as per Lang and Lundholm (1996) and 
    used by Behn et al. (2008); 
BIG4 = 1 if the client has a Big 4 auditor in year_t, and 0 otherwise; 
LOG_MKT = natural logarithm of market value of equity at the end of year_{t+1}; 
SURPRISE = (net income_{t+1} - net income_t)/market value of equity at the end of year_t; 
NETLOSS = 1 if the client has negative net income_{t+1}, and 0 otherwise; 
ZMIJ = distress score calculated using Zmijewski's (1984) unweighted original parameters; 
HORIZON = natural logarithm of the average number of calendar days between the forecast announcement date 
    and the subsequent earnings announcement date; 
STDROE = standard deviation of net income over the five years from year_{t-4} until year_t; 
EL = actual eps_{t+1}; and 
NANA = natural logarithm of the number of analysts following the client. 
offices is superior to that of smaller Big 4 offices. We enhance our study's controls by estimating 
all our propensity-score and audit-quality regressions including FY's comprehensive list of control 
variables.15 We find that the coefficients on BIG4 in the full and matched samples are insignificant 
for all three audit-quality proxies. These results (untabulated) provide additional support for the 
argument that the Big 4 distinction may reflect client and not auditor characteristics.16 


Auditor Office Size and Client Characteristics 


FY’s tests comparing audit quality among large and small auditor offices may encounter 
similar difficulties as the Big 4 auditor tests, given that there is a strong positive correlation 
between office size and client size. To investigate this possibility, we first replicate and confirm 
FY’s study using their full absolute discretionary accruals, small profit increases, and going- 
concern opinions models, and their sample years. Second, we create an indicator variable, MED- 
OFFICE, equal to 1 for offices above the median office size, and 0 otherwise, in order to match 
clients of large offices to those of small offices. Third, we confirm FY’s findings by replacing 
LOGOFFICE with MEDOFFICE in the full samples for the three audit-quality proxies. Finally, 
we use both client-size and propensity-score matching to match clients of offices above the median 
office size to those clients of offices below the median office size. We then find an insignificant 
difference in MEDOFFICE between large and small offices for the three foregoing audit-quality 
proxies (p > 0.50, p = 0.29, and p = 0.20, respectively). These matched-sample results are robust 
to 10,000 bootstrap repetitions. We caution that this analysis does not rule out FY’s conclusion 
that large offices have higher audit quality, although it suggests the importance of fully controlling 
for client characteristics in tests of audit quality. 


Nonlinear Robustness Checks 


To some degree, matching models purge the impact of nonlinearities on auditor size in our 
analyses. Nonetheless, as a robustness check, we use a backfitting partial linear model (Hastie and 
Tibshirani 1990) to estimate both the linear and nonlinear terms previously identified using the 
RESET test.17 We find that all inferences pertaining to the full and matched samples using partial 
linear regressions are the same as those documented in the main analyses using ordinary least- 
squares (OLS) regressions. 
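
For orientation, a partial linear model of this kind can be written as follows. The notation is an assumption introduced for illustration and is not taken from the original: $Q_{it}$ denotes an audit-quality proxy, $x_{it}$ the controls that enter linearly, $z_{j,it}$ the variables flagged as nonlinear, and $f_j$ unspecified smooth functions.

$$
Q_{it} = \beta_0 + \beta_1 BIG4_{it} + x_{it}'\gamma + \sum_{j} f_j(z_{j,it}) + \varepsilon_{it}
$$

Backfitting cycles through the components, re-estimating each $f_j$ by smoothing the partial residuals that remain after removing the current estimates of all other terms, until the estimates stabilize. The coefficient of interest remains $\beta_1$.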


15 These variables are as follows: LOGOFFICE is the natural logarithm of practice office size based on the aggregated 
client audit fees (in $ millions) of a practice office in a specific fiscal year; INFLUENCE is the ratio of a specific client’s 
total fees (audit fees plus nonaudit fees) relative to aggregate annual fees generated by the practice office that audits the 
client; TENURE is 1 if auditor tenure is three years or less, and 0 otherwise; NATIONAL_LEADER is 1 if an auditor is 
the number one auditor in an industry in terms of aggregated audit fees in a specific fiscal year, and 0 otherwise; 
CITY_LEADER is 1 if an office is the number one auditor in terms of aggregated client audit fees in an industry within 
that city in a specific fiscal year, and 0 otherwise; OPSEG is the number of operating segments reported in the 
Compustat segments database for the firm in year_t; GEOSEG is the number of geographic segments reported in the 
Compustat segments database for the firm in year_t; SALESGROWTH is the one-year growth rate of a firm's sales 
revenue; SALESVOLATILITY is the standard deviation of sales revenue, estimated over a rolling window that requires a 
minimum of three years of data; CFOVOLATILITY is the standard deviation of CFO, estimated over a rolling window that 
requires a minimum of three years of data; CFO is operating cash flow deflated by lagged total assets; LOSS is 1 if net 
income before extraordinary items is negative, and 0 otherwise; BANKRUPTCY is the Altman Z-score, which is a 
measure of the probability of bankruptcy, with a lower value indicating greater financial distress; VOLATILITY is a 
client's stock volatility and is the standard deviation of 12 monthly stock returns for the current fiscal year; and MB is 
the natural logarithm of the ratio of a client's market value of equity to its book value of equity in year_t. 

16 When we include the extensive list of control variables as per FY, the Big 4 effect could disappear due to the 
multicollinearity of these auditor-related variables with the BIG4 variable. For example, the correlation between BIG4 
and LOGOFFICE in our separate analyses is approximately 0.4. Thus, what we learn from the matched sample findings 
is that they are not likely caused by the multicollinearity of the additional auditor controls with the BIG4 variable as the 
matched samples balance and, hence, mitigate these multicollinearity effects to some extent. 

17 A backfitting algorithm simultaneously estimates the effects of nonlinear terms and the coefficients for the linear terms. 
This algorithm is available in the R and S-Plus packages using the function gam (generalized additive model). 
In addition to using the RESET test, we generate scatter plots with both linear and local 
polynomial fits of each nonlinear variable. These scatter plots support the RESET test conclusions 
concerning the documented nonlinearities in each audit-quality proxy sample. We also run a 
bootstrap analysis on the RESET analysis drawing 10,000 samples of 5,000 observations, and 
estimate the RESET tests using the bootstrapped estimates. We find that client size, measured as 
LOG_ASSETS or LOG_MKT, is the only variable nonlinear to both the auditor choice and audit- 
quality proxy in each of the audit-quality analyses. However, there could be other forms of 
nonlinearity that are not detected by these tests. 
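
For reference, the textbook form of the RESET test can be summarized as follows. The text does not state which powers of the fitted values or which test variant is used, so the choice of squared and cubed terms and all symbols below are assumptions. First estimate the linear model $y = X\beta + u$ and obtain fitted values $\hat{y}$, then estimate the augmented model

$$
y = X\beta + \delta_1 \hat{y}^{2} + \delta_2 \hat{y}^{3} + e
$$

and test $H_0\colon \delta_1 = \delta_2 = 0$ with

$$
F = \frac{(SSR_r - SSR_u)/q}{SSR_u/(n - k - q)},
$$

where $SSR_r$ and $SSR_u$ are the residual sums of squares of the restricted and augmented models, $q = 2$ is the number of added terms, $n$ is the number of observations, and $k$ is the number of columns of $X$ including the intercept. A rejection suggests that the linear specification omits nonlinear structure.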


Heckman Self-Selection Model 


To examine whether the results are sensitive to specifications that consider the Heckman 
(1979) procedure, we model the self-selection of auditors following Chaney et al. (2004).18 We 
estimate the model separately for each of the three audit-quality proxy samples and include the 
inverse Mills' ratio, estimated by year, in the respective second-stage pooled regressions, in the 
full and matched samples.19 In line with Francis et al.'s (2010) critique of the Heckman (1979) 
model pertaining to satisfying exclusion restrictions, we estimate the inverse Mills ratio control- 
ling for client size using total assets, total sales, market value, and various combinations of these 
variables. We continue to find evidence of the Big 4 effect in our full samples and do not find an 
effect for our matched samples using these alternative specifications. Current research is attempt- 
ing to extend the literature and resolve the differences in estimating treatment effects using either 
the Heckman (1979) or the propensity-score approaches (Guo and Fraser 2010). 
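
As background for the selection correction, the sketch below computes the inverse Mills ratio from a fitted first-stage index. It assumes a probit first stage and the common two-regime convention (one expression for Big 4 clients, another for non-Big 4 clients); the function names and the example index values are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(index, chose_big4):
    """Selection term from a probit index: phi/Phi if Big 4 was chosen, -phi/(1 - Phi) otherwise."""
    if chose_big4:
        return normal_pdf(index) / normal_cdf(index)
    # 1 - Phi(z) equals Phi(-z); using Phi(-z) avoids cancellation for large z.
    return -normal_pdf(index) / normal_cdf(-index)

# Hypothetical fitted first-stage index values for three clients.
for index, chose_big4 in [(1.5, True), (0.0, True), (0.0, False)]:
    print(index, chose_big4, round(inverse_mills_ratio(index, chose_big4), 4))
```

The resulting term is then added as a regressor in the second-stage regression, so its coefficient absorbs the correlation between the auditor-choice and outcome equations.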


Audit Fees and Client Characteristics 


Several studies use audit fees as a proxy to assess the potential superiority of Big 4 auditors, 
generally finding that the audit fees of the Big 4 clients are significantly larger than those of the 
non-Big 4 clients.20 Among other factors, fees proxy for the level of service provided (e.g., Davis 
et al. 1993; Whisenant et al. 2003) and are negatively associated with levels of earnings manage- 
ment (e.g., Frankel et al. 2002; Ashbaugh et al. 2003). However, the existence of a fee premium in 
itself does not imply higher audit quality, especially if the Big 4 auditors have more pricing power 
over their clients than do the non-Big 4 auditors (Francis 2004). Moreover, O'Keefe et al. (1994, 
242) caution that "inferences about prices in such studies can be erroneous if the cross-sectional 
variations in auditor effort caused by differences in client characteristics are not adequately con- 
trolled." 

As a sensitivity analysis, we apply our matching methodology to examine whether the Big 4 
premium may be attributed to client characteristics. Using an audit fee model similar to that 
employed by Chaney et al. (2004), we perform the same set of analyses (untabulated) as with the 
prior three audit-quality proxies, using Compustat fee data from 2000 to 2006. We find that the 


18 Following Chaney et al. (2004), we model the self-selection of auditors, estimated separately for each year, as follows
in order to obtain the inverse Mills ratio:


$BIG4_t = \beta_0 + \beta_1 LOG\_ASSETS_t + \beta_2 ATURN_t + \beta_3 CURR_t + \beta_4 DTA_t + \beta_5 ROA_t + \beta_6 ROALOSS_t + \varepsilon_t$


where BIG4 equals 1 if the client has a Big 4 auditor, and 0 otherwise; LOG_ASSETS equals the natural logarithm of
total assets; ATURN equals sales/total assets; CURR equals current assets/current liabilities; DTA equals total
debt/total assets; ROA equals net income before extraordinary items/average total assets; and ROALOSS equals
ROA multiplied by 1 if the client has negative net income, and 0 otherwise.

19 Other articles in this literature that apply the Heckman model to panel data in this manner include Khurana and Raman 
(2004), and Behn et al. (2008). 

20 Such studies include: Simunic (1980), Francis (1984), Francis and Stokes (1986), Chan et al. (1993), Craswell et al. 
(1995), DeFond et al. (2000), Ferguson et al. (2003), Whisenant et al. (2003), and Chaney et al. (2004). For an extensive 
review of this literature, see Hay et al. (2006). 

coefficient on BIG4 is positive and statistically significant at the 5 percent level for the full, the
propensity-score matched, and the attribute-based matched samples.21 Our results are consistent
with the propensity-score matched fee results presented in Clatworthy et al. (2009), and could be
a reflection of the following circumstances noted in Simunic (1980) and O’Keefe et al. (1994): (1)
that Big 4 auditors provide a higher quantity of audit services; (2) that Big 4 auditors charge a
higher unit price; and (3) that we still do not adequately control for all the relevant
client characteristics that drive the quantity of audit services rendered. Our analyses are robust to
the inclusion of a comprehensive list of client and auditor characteristics as in Francis and Yu 
(2009), and to different matching specifications. 


IX. CONCLUSION 

In this study, we examine whether differences in quality between Big 4 and non-Big 4 audit 
firms could be a reflection of client characteristics. Using matching models or controlling for an 
extensive list of client and auditor variables, we find that the treatment effects of Big 4 auditors are 
insignificantly different from those of non-Big 4 auditors with respect to discretionary accruals, 
the ex ante cost-of-equity capital, and analyst forecast accuracy. 

We caution the reader that our findings must be interpreted with due regard to their method- 
ological limitations. First, an inherent limitation of this approach is that we are unable to match on 
pre-treatment attributes. Second, the propensity-score and attribute-based matching models rely on 
the assumption that the effects of unobservables are not pertinent to estimating the treatment 
effects. Third, matching reflects a trade-off between identifying the treatment effects and the 
ability to generalize results to the full population. Fourth, we cannot ensure that we include all 
relevant client and auditor control variables. We cannot rule out the possibility that the foregoing 
limitations could introduce biases into our analyses. 

We reemphasize that our study does not resolve the underlying question as to whether differ- 
ences in audit-quality proxies between Big 4 and non-Big 4 auditors can be attributed to client 
characteristics, but rather that it only provides suggestive evidence. We hope that our results 
encourage other researchers to explore alternative methodologies that further disentangle client 
characteristics from audit-quality effects. 
plays in the disclosure of material weaknesses reported under Section 404 of the
examine the relation between material weakness (MW) disclosures and various IAF
attributes and activities. Our results indicate that MW disclosures are negatively asso-
quality assurance techniques into fieldwork, audits activities related to financial reporting,
and monitors the remediation of previously identified control problems. The timing
of Section 404 work and the nature of follow-up monitoring suggest that these aspects

I. INTRODUCTION
This study investigates the role that a firm's internal audit function (IAF) plays in the
disclosure of material weaknesses reported under Section 404 of the Sarbanes-Oxley Act of
2002 (U.S. Congress 2002). Management is ultimately responsible for establishing and 
maintaining adequate internal control over financial reporting and for evaluating the effectiveness 
of financial reporting controls and procedures. However, "support for management in the dis- 
charge of these responsibilities is a legitimate role for internal auditors," as long as it does not 
impair internal audit objectivity (Institute of Internal Auditors [IIA] 2004, 3). External auditing 
standards have long acknowledged internal auditing as a potentially valuable resource in the 
financial reporting process (SAS No. 65, AICPA 1991; AS No. 2, PCAOB 2004; AS No. 5,
PCAOB 2007a). Consistent with this reasoning, Bailey et al. (2003) find that companies view the
IAF as a means of correcting perceived breakdowns in business reporting, internal control, and 
ethical behavior. 

Despite the IAF's duties surrounding internal control over financial reporting (ICFR), few 
researchers have empirically investigated the IAF's role in the financial reporting process (Gram- 
ling et al. 2004). A notable exception is a recent study by Prawitt et al. (2009), which provides 
evidence that the IAF can improve reporting quality by mitigating potential weaknesses in incen- 
tive system design. Our study complements theirs, in that we examine the association between the 
IAF and ICFR through prevention and detection of material weaknesses. Accordingly, our study 
helps fill an important gap in the literature regarding the influence of the IAF on the quality of the 
financial reporting process. 

Professional standards and prior research (AICPA 1991; IIA 2008; Prawitt et al. 2009) suggest 
that IAF quality encompasses specific attributes of the organizations and parties performing inter- 
nal audit activities (e.g., competence of IAF personnel), as well as the nature and scope of 
activities performed (e.g., the extent to which IAF monitors remediation of previously identified 
control problems). We investigate the relation between these factors and the likelihood that a firm 
reports a material weakness (MW). For a material weakness to be disclosed, it must exist and be 
detected. It is not clear ex ante how our IAF quality measures will affect disclosures. For instance,
more competent IAF personnel can help management establish stronger controls over financial 
reporting, and thus reduce the existence of control problems. Conversely, if a material weakness 
exists, more competent IAF personnel are more likely to detect it. We carefully consider the 
influence of various aspects of IAF quality on the existence, detection, and disclosure of MWs in 
developing our hypotheses. 

We conduct our tests using data on 214 firms that provided detailed responses to the IIA
Global Auditing Information Network (GAIN) survey for 2003–2004. We identify 45 firms that
disclosed at least one MW under SOX Section 404. Results indicate that the education level of
IAF staff and the extent to which the IAF incorporates quality assurance techniques into fieldwork,
audits activities related to financial reporting, and monitors the remediation of previously
identified control problems are negatively associated with MW disclosures. The timing of Section 404
work and the nature of follow-up monitoring suggest that these aspects of IAF quality help prevent
the existence of MWs. The IAF practices of grading audit engagements and coordinating with
external auditors are both positively associated with MW disclosures. The positive relations
suggest that these activities increase the effectiveness of Section 404 compliance processes and
thereby increase the likelihood that extant MWs will be detected and disclosed. Together, our
results have important implications for managers responsible for determining IAF staffing and the
structure of IAF activities, external auditors who perform Section 404 work, and standard-setters
who provide Section 404 guidance. Moreover, our evidence that the IAF affects the financial
reporting process lends support to the requirement that NYSE-listed companies maintain an
internal audit function (NYSE 2009).

This study makes several important contributions to the literature. First, we expand extant
research on both internal auditing and ICFR by documenting associations between material weak- 
nesses and various measures of IAF quality. Second, we identify specific internal auditor practices 
and procedures that are associated with both the detection and prevention of material weaknesses. 
Few archival studies directly link auditor practices to the prevention or detection of audit excep- 
tions. Finally, we provide evidence that external auditors are more likely to detect material weak- 
nesses when they coordinate their efforts with the IAF. While a significant branch of the internal 
auditing literature focuses on the relation between external auditors and the IAF, no study of which 
we are aware has examined whether IAF involvement in external audits increases the effectiveness 
of external audits. 

Section II provides background information and reviews relevant literature, Section III de- 
scribes our hypotheses, and Section IV describes the sample and model. Results are discussed in 
Section V, with concluding remarks in Section VI. 


II. BACKGROUND AND LITERATURE REVIEW

Material Weakness Reporting


A material weakness is a deficiency, or combination of deficiencies, that results in a reasonable
possibility that a company's controls will fail to prevent or detect a material misstatement of an
account balance or disclosure (AS No. 5, PCAOB 2007a).1 Section 404 of SOX mandates that
managers evaluate internal control over financial reporting (ICFR) and present the results of their
evaluation in financial statements filed on Form 10-K and Form 10-Q. The regulation, which
became effective for accelerated filers for year-end dates beginning November 15, 2004, also 
requires external auditors to annually assess and state an opinion on ICFR. Accordingly, both 
management and external auditors are responsible for ensuring that material weaknesses are de- 
tected and disclosed under Section 404. Prior to the implementation of Section 404, Section 302 
required management to evaluate and report on the effectiveness of ICFR; however, the external 
auditor was not required to opine on ICFR (U.S. Congress 2002). We exclude Section 302 material 
weaknesses from our analysis because these disclosures were subject to less regulation and al- 
lowed greater management discretion than were material weaknesses reported under SOX 404 
(Ashbaugh-Skaife et al. 2007; Hoitash et al. 2009).

For a material weakness to be reported, it must exist, it must be detected, and it must be 
disclosed (Ashbaugh-Skaife et al. 2007). Figure 1 summarizes the sequence of Section 404 pro- 
cesses in a given year (Bedard and Graham 2011). Step 1 represents the firm's documentation and 
testing performed in support of management's evaluation of internal controls. Step 2 represents the 
external auditor's documentation and testing, which must follow the firm's work in a given area/ 
account. At the end of the year, internal control deficiencies (ICDs) that have not been remediated 
are classified by severity and aggregated to determine whether they constitute a material weakness
and, hence, must be publicly disclosed. The PCAOB and SEC direct external auditors and man- 
agers to evaluate the severity of each control deficiency to determine whether the deficiencies, 
individually or in combination, constitute a material weakness as of the date of management's 
assessment (AS No. 5, PCAOB 2007a; SEC 2007). Severity depends upon whether there is a 
reasonable possibility that the company's controls will fail to prevent or detect a misstatement and 
the magnitude of the potential misstatement. Multiple deficiencies that affect the same account or 


1 Although AS No. 2 was in effect during our sample period, we refer to AS No. 5, which supersedes AS No. 2, because
we do not expect the differences in the two standards to affect our predictions regarding IAF quality (PCAOB 2004,
2007a). Two key differences are that AS No. 5 eliminates the redundant requirement that external auditors opine on both
internal controls and on management’s assessment of internal controls and replaces the words "more than remote" with
the words "reasonable possibility" for defining what constitutes a material weakness.

FIGURE 1
Sequence of Section 404 Compliance Processes 

disclosure may, in combination, constitute a material weakness, even though individually the
deficiencies are less severe. 

Gramling et al. (2004, 196) suggest that management's increased accountability for ICFR 
under SOX implies an expanded role for the IAF. Consistent with this reasoning, the IIA provides
specific guidance regarding how the IAF can support management's SOX compliance (IIA 2004).
The IIA advocates that the IAF independently evaluate management's testing and assessment
processes, and management’s basis for their assertions regarding the adequacy of internal controls. 
If control gaps are identified, then internal auditing should assess management's plans for correct- 
ing them and perform follow-up reviews. The IAF can also perform effectiveness testing for 
reliance by external auditors. Finally, the IIA advocates that the IAF act as coordinator between
management and the external auditors and ensure that the results of ongoing internal audit activi- 
ties are disclosed. 


Prior Research on Auditor Quality 

Practitioners and academics alike generally contend that the effectiveness of internal controls 
is increasing in IAF quality. However, direct empirical evidence of this relation is limited (e.g., 
AICPA 1991; Kinney 2000; Bailey et al. 2003; Gramling et al. 2004; PCAOB 2007a; Prawitt et al.
2009).2 In an experimental setting, Schneider and Wilner (1990) find that, under certain conditions,
internal auditing deters financial reporting irregularities. In an archival study of Australian
firms, Davidson et al. (2005) find no significant relation between the presence (versus absence) of 
an IAF and earnings management. Prawitt et al. (2009) investigate the relation between IAF 
quality and earnings management using the GAIN database. Guided by external auditing stan- 
dards, they develop a composite measure of IAF quality from proxies for IAF competence, IAF 


2 See Gramling et al. (2004) for a review of the literature on internal auditing.

objectivity, the amount of financial reporting work the IAF performs, and IAF size. They find
evidence that IAF quality is associated with a lower level of earnings management. Their results 
suggest that higher-quality IAFs are more likely to deter managers from manipulating earnings 
and/or detect such manipulations and ensure that they are corrected prior to the issuance of 
financial statements. 

Two recent studies of ICDs identified under Sections 302 and 404 suggest that the quality of 
the auditors performing Section 302 and Section 404 work is positively associated with the 
detection and disclosure of control problems. Using proprietary data on small accelerated filers, 
Bedard and Graham (2011) find that firms that use outside consultants for Section 404 compliance 
report higher levels of ICD severity. They posit that outside consultants have greater expertise in 
control assessments that improves the effectiveness of Section 404 processes and increases the 
likelihood of ICD detection and the appropriateness of ICD classification. In a study of non- 
accelerated filers, Bedard et al. (2009) find that the disclosure of Section 302 ICDs increases with 
the experience of the external audit firm in conducting Section 404 work, suggesting that audit 
firms leverage the knowledge gained from SOX 404 audits in performing SOX 302 work. They 
also find that more Section 302 MWs are reported in the fourth quarter, when there is greater 
external auditor involvement relative to the first three quarters. They contend that external audit 
firms’ expertise, experience, and compliance processes for Section 302 are superior to those of
client firms, and thus lead to increased detection and disclosure of ICDs at year-end.3 While these
studies provide useful archival evidence on the relation between auditor characteristics and the 
detection and disclosure of ICDs, neither directly addresses internal auditor quality. 

Research on the effects of third-party monitoring on management reporting is also relevant to 
internal auditing (Prawitt et al. 2009). Hoitash et al. (2009) find that greater accounting and 
financial supervisory experience on the audit committee is associated with a lower likelihood of 
Section 404 MW disclosures. Studies of management forecasts and communications show that 
management is less biased when their bias is likely to be perceived by others (Schwartz and Young 
2002; Rogers and Stocken 2005). Brown and Pinello (2007) provide evidence that the annual 
reporting process, which calls for an external audit, mitigates earnings management in year-end 
reports relative to that in interim reports. They attribute this result, at least in part, to increased 
scrutiny of year-end reports. Accordingly, if the internal auditor function is viewed as a third party 
that monitors managers' actions on a year-round basis, then improvements in IAF quality should 
strengthen deterrence and detection mechanisms. 


III. HYPOTHESIS DEVELOPMENT

The overall relation between IAF quality and MW disclosures depends on how various IAF
attributes and activities affect the existence, detection, and disclosure of MWs. If the effectiveness 
of ICFR is increasing in IAF quality, then IAF quality should be negatively associated with the 
existence of control problems and positively associated with both the detection and disclosure of 
extant control problems. Greater IAF quality deters managers from taking actions that compromise 
controls and encourages managers to put strong controls in place. Albrecht and Albrecht (2004) 
note that an effective control structure is probably the single most important step to eliminate (or 
minimize) the opportunity to commit fraud. A more capable IAF is also more likely to detect and 
correct minor control problems before they become severe enough to be considered material 
weaknesses. Decreasing the existence of control deficiencies unambiguously reduces the likeli- 
hood that a material weakness is reported. 


3 Competing disclosure incentives of management and the external auditor also explain the increased disclosure rate 
(Ashbaugh-Skaife et al. 2007; Asare et al. 2007). 

If a material weakness already exists, then we expect that the improved detection capabilities
of higher quality IAFs will increase the likelihood of disclosure. When a material weakness exists,
the relation between IAF detection capability and disclosure depends upon (1) the likelihood that
the external auditors will identify the problem during their year-end audit without the help of the
IAF,4 and (2) management's ability to correct the weakness prior to year-end, which is affected by
the timing of IAF Section 404 procedures. If we assume the external auditors will not discover an
existing MW, then IAF detection directly increases the likelihood of disclosure.5 In this scenario,
management can avoid disclosure only if they can correct the problem prior to year-end.6 If we
assume that the external auditors will detect all extant MWs at year-end, regardless of IAF 
activities, then IAF detection of a MW prior to year-end can potentially decrease the likelihood of 
disclosure by providing management with an opportunity to remediate the control weakness. 

However, remediation requires that the IAF detect the problem early enough in the fiscal year 
for management to rectify the problem and that management deems the problem severe enough to 
take action. Ettredge et al. (2006) and PCAOB reviews of AS No. 2 (PCAOB 2005, 2007b) 
suggest that firms’ Section 404 procedures were performed near or at year-end during the time 
period of our sample (2004–2006), leaving management little time for remediation. Moreover,
Bedard and Graham (2011) report that management tends to underclassify the severity of control 
problems relative to external auditors, and thus may not recognize that a control problem is serious 
enough to warrant corrective action. Given this evidence, we conclude that it is unlikely that 
management will be able to successfully remediate the control problems in a timely manner. 
Accordingly, we expect that better IAF detection capabilities will lead to more MW disclosures, 
although we cannot rule out the possibility that enhanced IAF detection capabilities can reduce
MW disclosures. 

We draw on Prawitt et al. (2009) and professional guidance to develop measures of IAF
quality (AICPA 1991; IIA 2008). These sources suggest that IAF quality measures encompass (1)
competence, (2) objectivity, (3) relative investment in the IAF, and (4) the nature and scope of IAF
activities. We group the first three measures together as IAF attributes because they address the
characteristics of organizations performing internal audit activities, and thus are covered by the
Attribute Standards of the International Standards for the Professional Practice of Internal
Auditing (IIA 2008). The last construct, the nature and scope of IAF activities, captures IAF practices
that are guided by Performance Standards (IIA 2008).


IAF Quality Attributes 


IIA Attribute Standards stipulate that internal auditors possess the knowledge, skills, and other
competencies needed to effectively carry out their responsibilities (IIA 2008). To prevent and
detect internal control irregularities, internal auditors must have a thorough understanding of 
company operations, processes, and procedures, and they must be able to design and implement 
tests to determine whether processes and procedures are working as intended (Clark et al. 1980). 


4 The literature is only beginning to address this complex issue of which party (firm or external auditor) is more likely to
detect control deficiencies (Bedard et al. 2009; Bedard and Graham 2011). Therefore, we do not attempt to assess the
likelihood that the external auditors would fail to detect a MW had it not been identified by the internal auditors.

5 IIA guidance specifies that the IAF report all significant risk exposures and control issues to senior management and the
board. Senior management is obligated under Section 404 to disclose a material weakness. It is unlikely that both
management and the audit committee would knowingly violate SOX by failing to disclose a significant control problem
to the external auditors.

6 According to the SEC, "if management's evaluation process identifies material weaknesses, but all material weaknesses
are remediated by the end of the fiscal year, management may conclude that internal control over financial reporting is
effective at the end of the fiscal year" (SEC 2007, 10, footnote 20).

7 We use the most recent version of the IIA Standards, issued October 2008, to guide our choice of quality measures. This
version is very similar to earlier versions that were in place at the time the GAIN survey was conducted.

External auditing standards state that external auditors should consider professional certification,
professional experience, and training in evaluating internal auditor competence (SAS No. 65, 
AICPA 1991). Consistent with this statement, prior research suggests that external auditors’ evalu- 
ations of internal auditor competence are based upon professional certifications (Brown 1983) and 
the experience of IAF personnel (Messier and Schneider 1988). 

IIA Standards require internal auditors to be independent and objective in performing their
work (IIA 2008). An objective IAF is less likely to be influenced by management in evaluating
controls and reporting internal control problems to the audit committee. Professional governance 
guidelines and standards suggest that the reporting relationship between the Chief Audit Executive 
(CAE) and audit committee is a key determinant of internal auditor objectivity (Gramling et al. 
2004). Consistent with this premise, Bedard and Graham (2011) find that ICD severity levels are 
higher when the party performing the firm’s Section 404 work reports directly to the audit com- 
mittee rather than to management. 

Practitioners and academics generally contend that management can improve IAF quality by 
increasing resources allocated to the IAF (e.g., Gramling et al. 2004; Ge and McVay 2005). 
Greater IAF resources enable management to hire and retain more competent personnel and
improve the effectiveness of IAF consulting and assurance activities. In a descriptive analysis of 
MW disclosures, Ge and McVay (2005) find that poor internal control is usually related to an 
insufficient commitment of resources to accounting controls. Studies of the economic determinants 
of ICD disclosures find that firms with fewer resources available for internal controls are more 
likely to disclose ICDs (Ashbaugh-Skaife et al. 2007; Doyle et al. 2007a). Recent external auditing
studies suggest that the greater resources available to large and mid-tier audit firms, relative to 
small audit firms, enable them to develop more effective SOX compliance processes that lead to 
greater ICD detection and disclosure (Bedard et al. 2009; Bedard and Graham 2011). Thus, we 
expect firms with greater IAF investment will implement stronger Section 404 procedures in 
support of management's evaluation of ICFR. 

It is not clear ex ante how IAF quality attributes will affect MW disclosures. Competence,
objectivity, and investment enhance quality and, thus, should reduce the likelihood that material 
weaknesses exist. Conversely, if a material weakness does exist, a higher quality IAF will be more 
likely to detect it. Given the bi-directional implications for material weakness disclosures, we 
present a nondirectional hypothesis for the three quality attributes. 
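
The ambiguity behind the nondirectional hypothesis can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the probability of reporting a material weakness factors into the probability that one exists, the probability that it is detected given that it exists, and the probability that it is disclosed given detection. The function name and all numeric values are hypothetical and are not estimates from this study.

```python
def prob_mw_reported(p_exist: float, p_detect: float, p_disclose: float) -> float:
    """Probability that a material weakness is reported (illustrative only).

    p_exist    -- probability that a material weakness exists
    p_detect   -- probability of detection, given existence
    p_disclose -- probability of disclosure, given detection
    """
    for p in (p_exist, p_detect, p_disclose):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_exist * p_detect * p_disclose


# Hypothetical lower-quality IAF: weaknesses are common but often missed.
baseline = prob_mw_reported(0.20, 0.50, 1.0)

# Hypothetical higher-quality IAF where the prevention effect dominates:
# existence falls sharply, so reporting falls despite better detection.
prevention_dominates = prob_mw_reported(0.10, 0.80, 1.0)

# Hypothetical higher-quality IAF where the detection effect dominates:
# existence falls only slightly, so better detection raises reporting.
detection_dominates = prob_mw_reported(0.18, 0.80, 1.0)
```

Under these assumed values, the same improvement in detection is associated with either fewer or more reported material weaknesses, depending on how strongly IAF quality reduces the existence of control problems.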


H1: The likelihood that a firm reports a material weakness is significantly associated with 
IAF competence, IAF objectivity, and IAF investment.


Nature and Scope of IAF Activities 


External auditing standards contend that external auditors evaluate the nature, timing, and 
extent of IAF fieldwork in audit planning and determining whether to rely on the work of internal 
auditors. AICPA and IIA guidance suggest that the following factors are relevant to the financial
reporting process: (1) use of fieldwork quality assurance techniques, (2) inclusion of financial 
reporting processes in audit scope, (3) communication of grades or summary opinions on control 
effectiveness, (4) follow-up of previously identified control problems, and (5) coordination with 
external auditors (AICPA 1991; IIA 2008).

The quality of audit fieldwork performed is critical to the detection of internal control defi- 
ciencies. Schneider (1984, 1985) and Brown and Karan (1986) find that external auditors place 
more emphasis on the quality of work performance than either competence or objectivity in 
evaluating the IAF. Quality assurance (QA) techniques help ensure that IAF fieldwork is effective
and appropriate. QA activities include direct supervision, independent working paper review, 
solicitation of audit client feedback, peer review by fellow staff members, and the use of working 
paper checklists. 

As Prawitt et al. (2009, 1261) note, firms have broad latitude in setting the scope of the IAF,
and its focus can vary widely across organizations. IAF responsibilities generally encompass a 
range of control activities not directly related to financial reporting, such as operational audits, 
systems audits, and internal consulting projects. Since ICFR forms only a subset of a firm's overall 
internal controls, external auditors must evaluate the extent to which IAF activities are relevant to 
the financial statement audit in determining whether they can rely on IAF work (SAS No. 65,
AICPA 1991). We expect that greater IAF attention to financial reporting will affect both the
existence and detection of material weaknesses. 

Internal auditing standards direct internal auditors to document and communicate engagement 
results to management, and, where appropriate, include their overall opinion (IIA 2008). A grade 
or rating is a succinct means of conveying an opinion on the risk posed by the unit or functional 
area audited, and the IIA provides specific guidance on the practice of rating internal controls (IIA
2009a). A recent study by PricewaterhouseCoopers (PwC 2006) reports an increase in the prevalence
of grading and concludes that grading is now generally considered to be a best practice.
Grading motivates managers to put adequate controls in place to avoid a poor grade and facilitates 
design and implementation of monitoring systems. Both help prevent the existence of control 
problems. Grading also promotes rapid assessment of control risks by management, audit com- 
mittees, and external auditors (PwC 2006), and thus can help management and external auditors 
implement the risk-based approach to Section 404 compliance recommended by professional 
guidance (PCAOB 2007a; SEC 2007). As an attention-directing tool, grading improves an audi- 
tor’s assessment of the risk of financial misstatements and thereby facilitates appropriate allocation 
of audit resources to the evaluation and testing of relevant company-level controls (Hogan and 
Wilkins 2008; PCAOB 2007a; Wright and Ashton 1989). Accordingly, we expect IAF grading to 
improve the effectiveness of both internal and external auditors’ Section 404 audit procedures, 
thus increasing the likelihood that extant material weaknesses are detected. 

As with the IAF quality attributes, it is not clear ex ante how using QA techniques, focusing 
on financial reporting controls, and grading internal controls will affect the likelihood that a 
material weakness is reported. These monitoring activities can prompt managers to take preven- 
tative action in anticipation of the IAF's rigorous review (Brown and Pinello 2007). Conversely, 
such activities increase the likelihood that the IAF discovers and discloses any existing control 
problems. Further, IAF grading can increase the effectiveness of both internal and external Section 
404 compliance processes, leading to greater MW detection. Due to the bi-directional implications 
for material weakness disclosures, we test a nondirectional hypothesis. 


H2: The likelihood that a firm reports a material weakness is associated with: (1) the extent
to which the IAF uses quality assurance techniques, (2) the extent to which IAF activities
address financial reporting activities, and (3) whether the IAF grades audit
engagements.


IIA performance standards and SOX guidance require the Chief Audit Executive (CAE) to 
establish and maintain a system to monitor the disposition of previously identified control problems
(IIA 2008, 2002). Follow-up procedures can reduce the likelihood that a material weakness
exists at year-end and, hence, must be publicly disclosed. First, follow-up procedures provide the 
impetus for management to correct less severe control problems, thereby preventing them from 
becoming material weaknesses. Second, if the firm or its external auditors detect an existing 
material weakness at any time prior to year-end, then management can avoid disclosing it if they 
resolve the problem prior to year-end. Hence, IAF follow-up procedures can prompt management 
to remediate identified weaknesses in a timely manner. Accordingly, we predict that follow-up 
procedures will be negatively associated with MW disclosures. 

H3: Firms whose IAFs follow up on control problems are less likely to report material
weaknesses.


Both internal and external auditing standards encourage auditors to coordinate efforts related
to an integrated audit under AS No. 5 (PCAOB 2007a; IIA 2008). Methods of coordination include
conducting joint risk or planning sessions, performing audits of specific processes or locations,
and loaning IAF staff to the external auditor. As firms' "in-house" experts on internal control,
internal auditors possess company-specific knowledge about controls, operations, and the financial
reporting process that can aid external auditors in implementing the top-down, risk-based approach
recommended in AS No. 5 (PCAOB 2007a). Coordination should improve an external
auditor's assessment of the risk of misstatements, thus facilitating appropriate allocation of audit
resources to evaluation and testing of relevant controls (Hogan and Wilkins 2008; PCAOB 2007a).
Reliance on the work performed by the IAF, both independently and under the direction of the
external auditors, can improve the effectiveness of Section 404 documentation and testing. According
to the PCAOB (2007b, 8-9), "an auditor who appropriately uses the work of others can
achieve the objectives of the audit while not duplicating effort in lower-risk areas, and also is
better able to focus his or her own efforts on higher-risk controls." Moreover, to the extent that
coordination reduces time pressure on the external auditor, audit effectiveness should increase
with coordination (McDaniel 1990). Given the value the IAF can add to the external auditor's
Section 404 processes, we propose that internal-external auditor coordination will enhance the
overall effectiveness of the external auditor's Section 404 process, thereby leading to greater
detection and disclosure of MWs.9

Our prediction assumes that most of the external auditor's Section 404 documentation and 
testing takes place at year-end, giving management little opportunity to correct any control prob- 
lems detected by the external auditors. While we recognize that coordination can and does occur 
throughout the year, we expect that much of the benefit of that coordination is realized at year-end, 
when external auditors conduct their most rigorous reviews of the financial statements and the 
reporting process (Frankel et al. 2002; Brown and Pinello 2007; Bedard et al. 2009). In particular, 
the rapid enactment of Section 404 combined with delays in management’s completion of SOX
procedures left external auditors little choice but to perform most Section 404 work at year-end for 
the time period of our sample of MWs (PCAOB 2005; Ettredge et al. 2006). The benefits of IAF 
coordination and the year-end timing of external auditors' Section 404 processes lead to the 
following hypothesis: 


H4: Firms in which IAFs coordinate with external auditors are more likely to report material 
weaknesses. 


IV. RESEARCH DESIGN 
Data and Sample Selection 


Data for this study come from multiple sources. We use firm-level data collected by the IIA
through their GAIN survey for the years 2003 and 2004. The GAIN database consists of CAEs'
responses to a comprehensive survey designed to measure various aspects of an organization's
internal audit activities. The annual survey captures information on several topics, including
internal audit department costs, oversight and reporting responsibilities, audit committee, audit
life-cycles, and financial measures.10 Next, we collect firm data on MW disclosures from EDGAR,
firm financial data from Compustat, stock return data from CRSP, audit fee and restatement data
from Audit Analytics, and institutional ownership data from CDA/Spectrum.


8 The IAF must be of sufficient quality, i.e., it must be objective and competent for external auditors to rely on their work.
Thus, even though the IAF are employees of the firm being audited, we expect that the IAF will be objective in
assessing controls and reporting control problems to the external auditors.

9 External auditor reliance on the IAF is increasing in IAF quality (e.g., Schneider 1984, 1985; Maletta 1993), and as
previously discussed, it is not a priori clear how IAF quality affects the likelihood of MW disclosure. Our tests of the
relation between IAF coordination and MW disclosure include explicit measures of IAF quality, and thereby control for
the effect of IAF quality on disclosure.

Our initial sample contained 1,356 responses from 935 GAIN survey respondents collected in
years 2003 and 2004. Since firm names are not reported in the survey data, we employ a matching
algorithm to identify firms based on reported SIC code, total assets, revenues, and number of
employees. Next, we eliminate: (1) 687 firms for which data were unavailable on Compustat, (2)
six firms with American Depositary Receipts or missing CRSP data, (3) six firms with missing
10-K data after August 2002, (4) six firms missing stock information from both Compustat and
CRSP, and (5) eight firms missing necessary GAIN data, resulting in a sample of 329 firm-year
observations from 222 firms.11 For firms that report survey data for both 2003 and 2004, we use
only the data from the 2004 survey year, which yields a sample of 222 firms. Finally, we
eliminate five firms that only report material weaknesses under SOX Section 302, and three firms
that are non-accelerated filers and therefore not subject to SOX Section 404 reporting
requirements.12 The final sample consists of 214 firms (Table 1).
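
The firm-level attrition described above can be checked with simple arithmetic. The following Python sketch re-derives the counts from the eliminations listed in the text; the step labels are shortened paraphrases and the helper name `apply_eliminations` is introduced here only for illustration.

```python
# Firm-level sample attrition, using the counts reported in the text.
initial_firms = 935

eliminations = [
    ("not covered by Compustat", 687),
    ("American Depositary Receipts or missing CRSP data", 6),
    ("missing 10-K data after August 2002", 6),
    ("missing stock information from Compustat and CRSP", 6),
    ("missing necessary GAIN data", 8),
]

final_steps = [
    ("report material weaknesses only under SOX Section 302", 5),
    ("non-accelerated filers", 3),
]


def apply_eliminations(start, steps):
    """Subtract each elimination in turn and return the remaining count."""
    remaining = start
    for _label, dropped in steps:
        remaining -= dropped
    return remaining


# Firms left after matching and data-availability screens.
matched_firms = apply_eliminations(initial_firms, eliminations)

# Firms left after the Section 302 and non-accelerated-filer screens.
final_firms = apply_eliminations(matched_firms, final_steps)

print(matched_firms, final_firms)
```

The two printed counts can be compared against the subtotal and final rows of Table 1, Panel A.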


IAF Quality as a Determinant of Material Weaknesses 

We model the probability that a firm reports a material weakness as a function of IAF 
attributes, the nature and scope of IAF activities, and a set of control variables (Equation (1)). We 
estimate this model using a logit regression. 


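\begin{equation}
\begin{aligned}
\mathrm{Prob}(MW = 1) = {} & \beta_0 + \beta_1 \mathit{EXPERIENCE} + \beta_2 \mathit{EDUCATION} + \beta_3 \mathit{CERTIFICATION} \\
& + \beta_4 \mathit{TRAINING} + \beta_5 \mathit{CAEAC} + \beta_6 \mathit{CAEOFFICER} + \beta_7 \mathit{IASIZE} \\
& + \beta_8 \mathit{FIELDWORKQA} + \beta_9 \mathit{IAGRADE} + \beta_{10} \mathit{FINANCIALFOCUS} \\
& + \beta_{11} \mathit{FOLLOWUP} + \beta_{12} \mathit{COORDINATION} + \beta_{13} \mathit{SEGMENTS} \\
& + \beta_{14} \mathit{FOREIGNTRANSACTIONS} + \beta_{15} \mathit{M\&A} + \beta_{16} \mathit{RESTRUCTURE} \\
& + \beta_{17} \mathit{SALESGROWTH} + \beta_{18} \mathit{INVENTORY} + \beta_{19} \mathit{MARKETVALUE} + \beta_{20} \mathit{LOSS} \\
& + \beta_{21} \mathit{CFO} + \beta_{22} \mathit{SHUMWAY} + \beta_{23} \mathit{AGE} + \beta_{24} \mathit{AUDITORSPECIALIST} \\
& + \beta_{25} \mathit{BLUERIBBONAC} + \beta_{26} \mathit{ISO9000} + \beta_{27} \mathit{RESTATEMENT} \\
& + \beta_{28} \mathit{AUDITORRESIGN} + \beta_{29} \mathit{INSTITUTIONALOWNERSHIP} \\
& + \beta_{30} \mathit{REGULATEDINDUSTRY} + \varepsilon
\end{aligned}
\tag{1}
\end{equation}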


10 The data were subject to various validation checks, including validation measures built into the questionnaire and 
manual procedures and reasonableness tests applied after the data had been collected. The GAIN database covers 
a wide range of institutions including publicly traded companies, private companies, educational institutions, 
divisions within companies, and governmental institutions. More information can be found at the GAIN website: 
http://www.theiia.org/guidance/benchmarking/gain/. 

11 We eliminate firms with missing values for IASIZE, EXPERIENCE, EDUCATION, and CERTIFICATIONS, and set
missing values to zero for TRAINING, OBJECTIVITY, FIELDWORKQA, FOLLOWUP, IAGRADE, FINANCIALFOCUS,
and COORDINATION. There are no missing values for CAEOFFICER, BLUERIBBONAC, and ISO9000. In a
robustness check, we remove the firms with missing values in any IAF variables and obtain similar results.

12 Inclusion of the five firms that report material weaknesses under Section 302 does not materially affect our results or
conclusions. In untabulated robustness tests, the coefficient on FINANCIALFOCUS retains its hypothesized sign, but
becomes insignificant at the 10 percent level; coefficients on the other IAF quality measures have the same signs and
similar significance levels.


The Accounting Review January 2011 
American Accounting Association 


The Role of the Internal Audit Function in the Disclosure of Material Weaknesses 297 





TABLE 1
Sample Selection and Distribution by Year

Panel A: Sample Selection
                                                                     Firm-Year
                                                                  Observations    Firms

Firm-year observations obtained from GAIN database for
  survey years 2003 and 2004*                                            1,356      935
Less:
  Those not covered by Compustat                                         (990)    (687)
  American Depositary Receipts                                            (10)      (6)
  Those missing SEC filings after August 2002                              (6)      (6)
  Those missing stock information from both Compustat and CRSP            (10)      (6)
  Those missing IAF data from GAIN                                        (11)      (8)
                                                                           329      222
Elimination of 2003 firm-years if data from the 2004
  survey year are available                                              (107)       --
Elimination of firms that only report material weakness(es)
  under SOX Section 302                                                    (5)      (5)
Elimination of firms that are non-accelerated filers (with market
  value less than $75 million) and therefore are not subject to
  SOX Section 404                                                          (3)      (3)
Final sample                                                               214      214


Panel B: Distribution of IAF Data by Fiscal Years

Fiscal Year      n        %

2000             6     2.80
2001            42    19.63
2002            57    26.64
2003            97    45.32
2004            12     5.61
Total          214   100.00


A survey collected in 2003 or 2004 may describe earlier fiscal years. For example, survey information collected in 2003
may pertain to 2002 or 2001. 


We retained the survey data that pre-dates the year in which a material weakness is first disclosed. In the final sample, 
all but two firms' IAF data describe firm characteristics that had been in existence at least one year prior to the year in 
which a material weakness had been disclosed. The empirical results remain similar after excluding the two firms from 
the sample. 


where MW is an indicator variable that is equal to 1 if the firm disclosed a material weakness in 
internal control, and 0 otherwise. All model variables are defined in Table 2.
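
To illustrate how a logit model of this form is estimated, the following Python sketch fits a maximum-likelihood logit by Newton-Raphson on simulated data. It is a minimal illustration under stated assumptions, not the estimation code used in the study: only two of the thirty regressors (FOLLOWUP and IAGRADE) are included, the data are randomly generated, and the "true" coefficients are made-up values chosen only to give the simulated regressors opposite signs.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logit(rows, y, iterations=25):
    """Maximum-likelihood logit via Newton-Raphson; each row starts with a 1."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k                      # score vector X'(y - p)
        hess = [[0.0] * k for _ in range(k)]  # information matrix X'WX
        for x, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulated data: intercept, FOLLOWUP, IAGRADE (hypothetical coefficients).
random.seed(0)
true_beta = [-1.0, -1.5, 1.0]
rows, y = [], []
for _ in range(2000):
    followup = 1.0 if random.random() < 0.5 else 0.0
    iagrade = 1.0 if random.random() < 0.5 else 0.0
    x = [1.0, followup, iagrade]
    p = sigmoid(sum(b * v for b, v in zip(true_beta, x)))
    rows.append(x)
    y.append(1.0 if random.random() < p else 0.0)

beta_hat = fit_logit(rows, y)
print([round(b, 2) for b in beta_hat])
```

In practice a statistical package would be used for a model with thirty regressors; the hand-written solver here only keeps the sketch free of external dependencies.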


Material Weakness Firms 

To construct the dependent variable, we examine 10-Ks and 10-Qs for the 214 sample firms 
in the EDGAR database and identify MW disclosures during the period November 15, 2004 to 
December 2006. We identify 47 firms that disclosed at least one material weakness during this 
time frame. We remove two firms due to missing data, leaving 45 firms that report at least one 
material weakness. 
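
The construction of the dependent variable can be sketched as a date-window check. In the Python sketch below, the firm labels and disclosure dates are hypothetical, and reading "December 2006" as December 31, 2006 is an assumption.

```python
from datetime import date

WINDOW_START = date(2004, 11, 15)
WINDOW_END = date(2006, 12, 31)  # assumption: "December 2006" read as month-end


def mw_indicator(disclosure_dates):
    """Return 1 if any MW disclosure date falls inside the window, else 0."""
    return int(any(WINDOW_START <= d <= WINDOW_END for d in disclosure_dates))


# Hypothetical firms and disclosure dates, for illustration only.
disclosures = {
    "firm_a": [date(2005, 3, 10)],
    "firm_b": [],
    "firm_c": [date(2004, 8, 1)],  # falls before the window opens
}

mw = {firm: mw_indicator(dates) for firm, dates in disclosures.items()}
print(mw)
```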


November 15, 2004 coincides with the date that Section 404 became operational for accelerated filers. 
TABLE 2
Variable Definitions


Internal Control over Financial Reporting

MW = An indicator variable that equals 1 if the firm disclosed a material
    weakness under Section 302 or Section 404, and 0 otherwise.

IAF Attributes^a

EXPERIENCE = Average number of years of auditing experience (internal and
    external) of the audit staff (B6a and B6b).

EDUCATION = Average of the number of years of undergraduate and graduate
    education of the audit staff, based on highest degree achieved.
    Associate, Bachelor, Master, and Ph.D. degrees are assumed to take
    2, 4, 6, and 8 years of study, respectively (B5).

CERTIFICATIONS = Percentage of professional staff members with one or more audit
    certifications (B8).

TRAINING = Annual hours of training per internal auditor (B15b).

CAEAC = An average of eight 0-1 survey items that measures the amount of
    relevant internal control information the CAE reviews with the
    audit committee. Four items address the control environment: risk
    assessment system, overall assessment of corporate control
    environment, assessment of control environment by major
    subsidiary of operating entity, coordination of internal auditing with
    external auditor's plan. Four items address the IAF: significant
    findings, audits performed, fraud-conflicts of interest; and results of
    monitoring programs concerning compliance with law, code of
    conduct and ethics (C9, C10).

CAEOFFICER = An indicator variable that equals 1 if the CAE position is an officer
    of the firm (E7).

IASIZE = The total annual operating costs of the IAF (B1g) divided by the
    firm assets (A2), multiplied by 100.

IAF Activities

FIELDWORKQA = An average of seven survey items coded from 0 to 2 (0: Never, 1:
    Occasionally, and 2: Regularly) that measures the extent to which
    various quality assurance (QA) techniques are used in fieldwork.
    The QA techniques are: direct supervision; independent working
    paper review; audit client feedback; peer review by fellow staff
    members, working paper checklists, and ticklists; management
    participation; and other (H4).

IAGRADE = An indicator variable that equals 1 if the final internal audit report
    includes a grade or score as determined by the results of the audit
    (H12).

FINANCIALFOCUS = An average of six survey items that measure how frequently the
    IAF performs audits of various financial activities (0: Never, 1:
    Occasionally, and 2: Regularly). The survey classifies the following
    five activities as financial: adequacy of internal accounting controls;
    accuracy, reliability, and completeness of financial records;
    usefulness of financial reporting for management control and
    decision making; impact of changes in accounting rules or
    regulations; interim quarterly financial results reported externally
    (4). We also include the frequency of GAAP compliance audits
    (G3e).

FOLLOWUP = An indicator variable that equals 1 if there is a formal follow-up
    procedure to test the implementation of corrective action (H11).

COORDINATION = An indicator variable that equals 1 if internal audit coordinated
    audit services with the external auditors (F9a).

TABLE 2 (continued)


Control Variables

SEGMENTS = Natural log of the sum of the number of operating and geographic
    segments in 2003 reported by Compustat Segment file.

FOREIGNTRANSACTIONS = An indicator variable that equals 1 if the firm reports a non-zero
    value for foreign currency translations (#150) in 2003.

M&A = Proportion of years from 2001 to 2005 that a firm was involved in
    a merger or acquisition (Compustat AFNT #1).

RESTRUCTURE = Proportion of years from 2001 to 2005 that a firm was involved in
    restructuring. We identify a firm undergoing a restructuring if any
    of the following Compustat data items are non-zero: #376, #377,
    #378, or #379.

SALESGROWTH = Proportion of years from 2001 to 2005 that a firm’s annual
    industry-adjusted sales growth (#12) falls into the top quintile.

INVENTORY = Average inventory to total assets (#3/#6) from 2001 to 2005.

MARKETVALUE = Natural log of the market value of equity (#199 × #25) in 2003.

LOSS = Proportion of years from 2001 to 2005 that a firm reports negative
    earnings before extraordinary items (#18).

CFO = Average cash flows from operations to total assets (#308/#6) from
    2001 to 2005.

SHUMWAY = The probability of bankruptcy as predicted by Shumway’s (2001)
    default hazard model.

AGE = Natural log of the number of years the firm has Compustat data
    since 1980.

AUDITORSPECIALIST = An indicator variable that equals 1 if the firm was audited by an
    industry specialist auditor. We define industry specialist as an
    auditor that collects the greatest percentage of its audit fees in the
    client’s industry.

BLUERIBBONAC = A score with a range of 0 to 1 measuring the extent to which the
    firm has implemented the Blue Ribbon Audit Committee
    recommendations. The score is an average of ten Blue Ribbon
    Committee recommendations coded 0/1: all members meet new
    criteria of independence; at least three independent outside
    directors; all members possess core skills including finance literacy;
    committee’s charter is reassessed annually; proxy report states
    committee fulfills its charter; committee is accountable for auditor
    relations; outside auditors disclose all consulting assignments;
    auditors discuss adequacy of company accounting; 10-K’s MD&A
    discloses financial statement reviews and discussions; and auditors
    review quarterly reports and 10-Q before release.

ISO9000 = An indicator variable that equals 1 if the firm is involved in ISO
    9000.

RESTATEMENT = An indicator variable that equals 1 if the firm had a restatement
    from 2001 to 2005.

AUDITORRESIGN = An indicator variable that equals 1 if the firm experienced an
    auditor resignation during 2001 to 2005.

INSTITUTIONALOWNERSHIP = The percentage of institutional ownership as of December 31,
    2003.

REGULATEDINDUSTRY = An indicator variable that equals 1 if the firm is in financial
    services or utility industry.


^a Letters and numbers in parentheses for IAF attributes and activities refer to item codes in 2004 GAIN Survey.
IAF Attributes


IAF quality attributes include the competence and objectivity of internal audit personnel and
the amount firms invest in the IAF. In measuring these attributes, we generally follow professional
guidance and Prawitt et al. (2009), who also use the GAIN data to measure IAF quality. To
measure IAF competence, we use four variables, EXPERIENCE, EDUCATION, CERTIFICATION,
and TRAINING (SAS No. 65, AICPA 1991; Prawitt et al. 2009). EXPERIENCE is defined
as the average number of years of internal and external auditing experience of the audit staff;
EDUCATION is the average number of years of undergraduate and graduate education; CERTIFICATION
is the percentage of professional staff members with one or more audit certifications;
and TRAINING is the average annual hours of training per staff member.14 IIA standards and
guidelines for objectivity recommend that the CAE regularly communicate the results of ongoing
internal audit activities to the audit committee (Practice Advisory 1110-1, IIA 2009b, 1), and that
the CAE report to a level within the organization "that allows the internal audit activity to fulfill
its responsibilities" (Attribute Standard 1110, IIA 2008, 2). We capture CAE-audit committee
communication with the variable CAEAC, which measures the amount of control-related information
the CAE reviews with the audit committee. Examples of such information include assessments
of the control environment in total and by major subsidiary, areas audited, significant
findings, fraud, and ethics compliance. A second proxy for objectivity, CAEOFFICER, equals 1 if
the CAE position is an officer of the firm, and 0 otherwise. We expect that a CAE who is an officer
will have greater support from the board and senior management, which in turn should assist the
IAF in performing work free from interference (Practice Advisory 1110-1, IIA 2009b). We
measure IAF investment, IASIZE, as the total IAF annual operating costs divided by total assets,
based on a measure developed by Prawitt et al. (2009).16
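
As a rough illustration of how the competence and investment measures are built from survey responses, the following Python sketch computes EDUCATION, CERTIFICATIONS, and IASIZE using the definitions in Table 2. The staff list and dollar amounts are hypothetical, the function names are introduced here only for illustration, and expressing CERTIFICATIONS on a 0 to 100 scale is an assumption.

```python
# Years of study assumed for each highest degree, as stated in Table 2.
DEGREE_YEARS = {"associate": 2, "bachelor": 4, "master": 6, "phd": 8}


def education(highest_degrees):
    """Average years of education across staff, based on highest degree."""
    years = [DEGREE_YEARS[d] for d in highest_degrees]
    return sum(years) / len(years)


def certifications(certified_staff, professional_staff):
    """Percentage of professional staff with one or more audit certifications."""
    return 100.0 * certified_staff / professional_staff


def iasize(annual_iaf_costs, total_assets):
    """IAF annual operating costs scaled by total assets, multiplied by 100."""
    return 100.0 * annual_iaf_costs / total_assets


# Hypothetical internal audit department with four staff members.
staff = ["bachelor", "bachelor", "master", "associate"]

print(education(staff))          # (4 + 4 + 6 + 2) / 4
print(certifications(3, 4))      # three of four staff certified
print(iasize(2.0, 1000.0))       # costs and assets in the same units
```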


Nature and Scope of IAF Activities 


The nature and scope of IAF activities investigated include: (1) use of quality assurance
techniques in fieldwork, (2) inclusion of financial reporting processes in audit scope, (3) inclusion
of grades or summary opinions on control effectiveness in audit reports, (4) follow-up of previously
identified control problems, and (5) coordination with external auditors. We measure the
extent to which the IAF uses quality assurance techniques, FIELDWORKQA, with a summative
score of variables that measure the frequency with which various QA techniques are used.
FINANCIALFOCUS is an index compiled from several items that measure the extent to which the
IAF audits various activities related to financial reporting.17 The indicator variable IAGRADE
equals 1 if the final internal audit report includes a “grade” or “score” for areas reviewed. The
indicator variable FOLLOWUP equals 1 if internal auditors formally monitor the resolution of
previously identified control problems, and 0 otherwise. The indicator variable COORDINATION
is coded 1 if internal auditors coordinate services with external auditors, and 0 otherwise. The
survey cites the following coordination methods: loaning staff to external auditors; performing
complete or partial audits of specific locations, products, or functions; conducting joint annual
planning sessions; and conducting joint risk or control sessions. Figure 2 summarizes hypothesized
associations between the likelihood of a MW disclosure and variables measuring IAF
attributes.


14 Our measures for EXPERIENCE and CERTIFICATIONS differ slightly from Prawitt et al. (2009), while our training
measure is the same. Prawitt et al. (2009) include only internal audit experience in their experience measure; we include
both internal and external audit experience because external auditing experience is also valuable in conducting internal
audits. Prawitt et al. (2009) calculate CERTIFICATIONS with two different survey items that measure separately the
number of CIA and CPA designations. This approach double-counts individuals who have both designations. To avoid
double-counting, we use a single survey question that asks for the "total number of professional staff with one or more
audit certifications."

15 Prawitt et al. (2009) measure objectivity with a single dichotomous variable indicating whether the CAE reports to the
audit committee. Robustness tests using this variable give materially similar results in our sample.

16 Prawitt et al. (2009) also scale IAF operating costs by assets. However, they then subtract the average level of industry
investment (computed using all GAIN firms) from this amount and convert this measure to a dichotomous variable by
assigning a value of 1 to firms whose investment equals or exceeds the average level of investment for that firm's
particular industry. We use Prawitt et al.'s (2009) dichotomous measure in robustness tests and it does not affect the
signs or significance of our IAF measures.

17 Prawitt et al. (2009) measure FINANCIALFOCUS with a survey item that gives the percentage of time the IAF spends
performing financial audit work. In our sample, there are too many missing values (78 out of 214) for this item.
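
The activity measures described above combine frequency-coded survey items and yes/no indicators. The Python sketch below shows one way such measures could be computed, following the Table 2 definition of FIELDWORKQA as an average of items coded 0 (Never), 1 (Occasionally), and 2 (Regularly). The survey responses are hypothetical, and the function names are introduced here only for illustration.

```python
RESPONSE_CODES = {"never": 0, "occasionally": 1, "regularly": 2}


def activity_index(responses):
    """Average of survey items coded 0 (never), 1 (occasionally), 2 (regularly)."""
    codes = [RESPONSE_CODES[r] for r in responses]
    return sum(codes) / len(codes)


def indicator(condition):
    """Code a yes/no survey response as 1 or 0."""
    return 1 if condition else 0


# Hypothetical responses to the seven fieldwork QA items.
qa_items = [
    "regularly", "regularly", "occasionally", "never",
    "regularly", "occasionally", "never",
]

fieldworkqa = activity_index(qa_items)   # (2 + 2 + 1 + 0 + 2 + 1 + 0) / 7
iagrade = indicator(True)                # report includes a grade or score
followup = indicator(False)              # no formal follow-up procedure

print(fieldworkqa, iagrade, followup)
```

An average keeps the index on the original 0 to 2 scale regardless of the number of items, which makes FIELDWORKQA and FINANCIALFOCUS directly comparable even though they are built from seven and six items, respectively.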


Control Variables 


In Equation (1) we use the following variables to control for several firm-specific factors that
prior research has shown to be correlated with the likelihood that a MW is disclosed: (1) SEGMENTS
and FOREIGNTRANSACTIONS proxy for internal control risks that stem from organizational
complexity; (2) RESTRUCTURE, M&A, and SALESGROWTH proxy for internal control
risk associated with rapid organizational change; (3) INVENTORY captures internal control risk
associated with inventory and product obsolescence costs; (4) MARKETVALUE controls for the
level of resources available to invest in internal controls; (5) LOSS, CFO, and SHUMWAY measure
financial distress and reflect a firm's diminished ability to adequately invest in internal controls;
(6) AGE measures firm age; (7) AUDITORSPECIALIST is a dichotomous variable indicating
whether the external auditor is an industry specialist; and (8) BLUERIBBONAC controls for the
extent to which firms have implemented Blue Ribbon Committee recommendations (Carcello et
al. 2005; Krishnan 2005; Ashbaugh-Skaife et al. 2007; Doyle et al. 2007a; Stephens 2009). We
include ISO9000 to control for firm certification by the International Organization for Standardization (ISO)
because ISO implementation may potentially improve the internal controls associated with organizational
business practices. The variables RESTATEMENT and AUDITORRESIGN control for
existing internal control problems associated with earnings restatements and auditor resignations,
respectively, during the 2001-2005 period. Finally, INSTITUTIONALOWNERSHIP and REGULATEDINDUSTRY
proxy for external monitoring mechanisms that may be associated with the
frequency of material weakness disclosures. Variable definitions are summarized in Table 2.


FIGURE 2
Hypothesized Relations between IAF Attributes and Activities and the Likelihood That a
Firm Discloses a Material Weakness


18 We also used Altman's Z-score in place of the Shumway score to evaluate the sensitivity of the model. The signs and
significance levels of IAF variables remain unchanged in our full model.


V. RESULTS 
Descriptive Statistics 

Table 3 provides descriptive statistics for the firms in our sample, along with comparative 
statistics for firms in the Compustat universe. Panel A of Table 3, which presents the distribution 
of sample firms by industry, indicates that the sample contains a high concentration of firms in 
regulated industries. Wilcoxon tests, reported in the extreme right column of Panel B, indicate that
our sample firms are significantly larger, older, and more profitable than those in the Compustat 
universe. Prior research suggests that firms in regulated industries, and those that are older, larger, 
and more profitable, make greater investments in the IAF (Wallace and Kreutzfeldt 1991; Carcello 
et al. 2005; Goodwin and Kent 2006). The industry, size, and profitability comparisons are con- 
sistent with the propensity for GAIN survey respondents to represent firms that have relatively 
large internal audit functions. To the extent that firms with large internal audit functions are more 
prone to respond to the GAIN survey, our tests are potentially biased against detecting associations 
between IAF quality and material weaknesses. 

Panel C of Table 3 compares the industry distribution of reported material weaknesses in our 
study sample to the distribution of firms reported in Doyle et al. (2007a). Approximately 21 
percent of our sample firms disclosed at least one material weakness during the period November 
2004 through December 2006. This is comparable to Doyle et al. (2007b), who find that 17 
percent of their sample firms report material weaknesses during the period from August 2002 
through October 2005. 

Table 4 presents descriptive statistics for the independent variables in our model for the 
overall sample, partitioned on MW disclosures. Univariate tests of differences between the parti- 
tions indicate that material weakness firms are more likely to have internal auditors that issue 
grades in their audit reports (64 percent versus 42 percent; p = 0.019), and have fewer financial 
reporting activities audited by their internal auditors. Consistent with prior research, material
weakness firms have significantly higher incidences of restructuring activities and losses, lower 
cash flows from operations, and higher bankruptcy risk. Univariate tests also show that a signifi- 
cantly greater proportion of material weakness firms report foreign currency adjustments and 
auditor resignations. Finally, the material weakness firms in our sample tend to be smaller and less 
likely to have ISO9000 certifications. Overall, these results suggest that material weakness disclosures
are influenced by differences in some IAF attributes in addition to previously documented
firm characteristics. 

Table 5 provides Spearman and Pearson correlation coefficients for the independent variables 
in our model. Since IAF measures attempt to capture the same underlying construct, quality, we 
find some significant correlations between IAF attributes and activities (Table 5, Panel A). In 
particular, two measures of competence, CERTIFICATIONS and EXPERIENCE, are highly correlated
(r = 0.39). Also, FINANCIALFOCUS is significantly correlated with CAEAC (r = 0.34),
suggesting that IAFs that focus greater attention on financial reporting activities tend to report
more information to the audit committee. All other significant correlations among our IAF quality
measures fall below 0.30. One measure of objectivity, CAEOFFICER, is significantly associated
with three control variables, INVENTORY (Spearman r = 0.42), CFO (Spearman r = 0.40), and
REGULATEDINDUSTRY (Spearman r = -0.50). Investment in the IAF (IASIZE) is significantly
and negatively correlated with the two control variables, MARKETVALUE (Spearman r = -0.46;
Pearson r = -0.35) and REGULATEDINDUSTRY (Pearson r = -0.41). Remaining correlations
between IAF quality measures and control variables are below 0.40.
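
Because both Pearson and Spearman coefficients are reported, a short sketch of how the two differ may be useful. The Python code below computes each from scratch: Pearson measures linear association in the raw values, while Spearman is the Pearson correlation of the ranks (tied values share their average rank). The example data are made up and are not drawn from the study's sample.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y rises monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

For a monotone but nonlinear relation such as this one, the rank-based coefficient reaches its maximum while the Pearson coefficient does not, which is one reason the two can differ for skewed variables such as firm size.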


Multivariate Results 


Table 6 reports the results of our logistic regression. We report parameter estimates, marginal 
effects, and the change in the probability of a firm disclosing a material weakness as a result of 
moving from the first to the third quartile value of the independent variable, while holding other 
regressors at their mean values.19 We report the results for full and stepwise regression models in
Panels A and B, respectively. All p-values refer to two-tailed tests of significance. 
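
The two effect sizes reported alongside the coefficients can be sketched as follows. The Python code below evaluates the logistic density term at a given linear index, scales it by a coefficient to obtain the standard logit marginal effect (scaling by the coefficient is the textbook definition and is assumed here), and computes the change in predicted probability when one regressor moves from its first to its third quartile with the others held at their means. All numeric inputs are hypothetical and are not taken from Table 6.

```python
import math


def logit_prob(index):
    """Predicted probability for a given linear index beta'X."""
    return 1.0 / (1.0 + math.exp(-index))


def logit_density(index):
    """e^index / (1 + e^index)^2, the logistic density at the index."""
    e = math.exp(index)
    return e / (1.0 + e) ** 2


def marginal_effect(beta_k, index_at_means):
    """Standard logit marginal effect of regressor k at the mean index."""
    return beta_k * logit_density(index_at_means)


def quartile_change(beta_k, q1, q3, index_at_means, mean_k):
    """Change in predicted probability when regressor k moves from Q1 to Q3,
    holding the other regressors at their mean values."""
    base = index_at_means - beta_k * mean_k  # index contribution of the others
    return logit_prob(base + beta_k * q3) - logit_prob(base + beta_k * q1)


# Hypothetical inputs, for illustration only.
beta_k = -1.0
index_at_means = 0.0
change = quartile_change(beta_k, q1=0.0, q3=1.0,
                         index_at_means=index_at_means, mean_k=0.5)

print(marginal_effect(beta_k, index_at_means), change)
```

The marginal effect is a local slope, whereas the quartile change is a discrete difference in predicted probabilities, so the two need not agree when the interquartile range is wide.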

We find little support for H1, which predicts associations between MW disclosures and the 
IAF quality attributes of competence, objectivity, and IAF investment. Among our four proxies for 
IAF competence, only EDUCATION is significantly associated with the probability that a firm 
reports a material weakness (Coeff. = -1.682, p < 0.05). The results in column (4) indicate that
moving from first to third quartile of EDUCATION decreases the probability of material weakness 
disclosure by 1.8 percent. Contrary to H1, the remaining measures of IAF competence (EXPERIENCE,
CERTIFICATION, and TRAINING) and our proxies for IAF objectivity (CAEAC and CAEOFFICER)
and IAF investment (IASIZE) are not significantly associated with MW disclosures.
The lack of a statistically significant association could reflect three factors. First, IAF competence, 
objectivity, and investment improve the IAF's ability to prevent material weaknesses from occur- 
ring, while also increasing the likelihood that existing material weaknesses are detected and 
disclosed. Hence, these opposing effects may offset each other. Second, the small size and relative 
homogeneity of our sample, combined with the large number of control variables, lowers the 
power of our statistical tests. Finally, CAEAC and IASIZE are significantly correlated with multiple
control variables that are significant predictors of MWs, making it difficult to isolate relations 
between these IAF attributes and MW disclosures. Accordingly, further investigation into these 
specific aspects of IAF quality is warranted. 

We find strong support for H2, which predicts associations between MW disclosures and the 
following IAF activities: FIELDWORKQA, FINANCIALFOCUS, and IAGRADE. The coefficients 
on FIELDWORKQA and FINANCIALFOCUS are negative and significant (Coeff. = -2.674, p < 
0.01; Coeff. = -1.955, p < 0.05, respectively), indicating that MW disclosures are decreasing in 
the use of quality assurance techniques during fieldwork and the extent to which IAF scope 
includes financial reporting processes. Moving from first to third quartile of FIELDWORKQA and 
FINANCIALFOCUS decreases the probability of firms reporting a material weakness by 2.1 percent 
and 2.7 percent, respectively. We also find that IAGRADE is positively and significantly 
associated with the probability of disclosing a material weakness (Coeff. = 1.445, p < 0.05), 
suggesting that grades facilitate Section 404 risk assessment, and thereby increase the likelihood 
that extant MWs are identified. Firms whose internal auditors issue grades or ratings are 3.1 
percent more likely to report a material weakness than those that do not. 


19 We compute the marginal effects as $e^{\beta' X} / (1 + e^{\beta' X})^{2}$, where $\beta' X$ is computed using the mean value of X. If X contains 
logs, then we substitute the mean of the original variable into the logarithm function. 

As predicted by H3, firms with IAFs that follow up on extant control problems are significantly 
(Coeff. = -1.213, p < 0.10) less likely to report a material weakness. This result suggests 
that monitoring the remediation of previously identified control problems prompts management to 
correct them in a timely manner. A one-unit change in FOLLOWUP is associated with a 3.4 
percent decrease in the probability of reporting a material weakness. 

Consistent with H4, firms are significantly more likely to report a material weakness (Coeff. 
= 4.095, p < 0.01) when the IAF coordinates audit activities with the external auditors. This 
result suggests that external auditors can increase the effectiveness of their Section 404 procedures 
by coordinating with a firm's IAF. The change in probability analysis shows that a unit change in 
COORDINATION increases the probability of firms reporting a material weakness by 3 percent. 

Our findings regarding the nature and scope of IAF activities suggest implications for practice. 
The results for FIELDWORKQA and FINANCIALFOCUS indicate that these audit practices 
prevent MWs from occurring and/or ensure that extant MWs are detected early enough in the 
firm's Section 404 testing to enable management to correct problems. The year-end timing of most 
Section 404 work suggests that these activities are more preventative than detective. Together with 
tests of the FOLLOWUP variable, our results imply that firms can potentially improve ICFR by 
expanding fieldwork quality assurance practices, devoting greater effort to financial statement 
reporting processes, and aggressively following up on previously detected audit exceptions. 

The positive relations between MW disclosures and both IAGRADE and COORDINATION 
suggest that these activities increase the effectiveness of Section 404 compliance processes by 
facilitating risk assessment. IAF grading promotes rapid assessment and prioritization of control 
risks by management, audit committees, and external auditors (PwC 2006). Similarly, information 
gained through coordination enables external auditors to alter the nature, timing, and extent of 
their testing to reflect risk levels. Accordingly, these findings confirm the appropriateness of recent 
regulatory guidance that instructs firms and external auditors to use a risk-based approach to 
Section 404 compliance (SEC 2007; PCAOB 2005, 2007b). The results are also consistent with 
recent PCAOB (2005, 2007a) guidance that encourages external auditors to more effectively use 
the work of others.20 

The signs and significance levels of our control variables are generally consistent with prior 
research, except for the negative coefficient on SEGMENTS and the positive coefficient on AGE 
(Ashbaugh-Skaife et al. 2007; Doyle et al. 2007a; Krishnan and Vishwanathan 2007; Stephens 
2009). We attribute these differences to our sample, which has a much larger proportion of utilities 
and financial firms than samples used in prior studies. 

Since many independent variables are not statistically significant in Panel A of Table 6 and we 
have a relatively small sample size, we also present logit results for a stepwise hierarchical regression 
model in Panel B.21 The coefficient signs in the reduced model are consistent with the full 
model, while the sizes, significance levels, and marginal effects of the coefficients are generally 
larger. 

Sensitivity Analysis 

One limitation of our study is that the IAF data describe different years for different firms. 
IAF data for 57 firms in the sample correspond to the pre-SOX time period. This could bias our 


20 The methods of coordination used (multiple methods per firm are permitted) and the percentage of firms using each 
method are as follows: (1) loan to external auditors, 28 percent, (2) perform complete or partial audit of specific 
locations, products or functions, 58 percent, (3) conduct joint annual planning sessions, 58 percent, and (4) conduct joint 
risk or control sessions, 39 percent. We classify IAFs that responded "yes" to (1) and/or (2) as performing work for use 
by external auditors. We perform robustness tests with a continuous measure based on the number of coordination 
methods used. This variable gives similar results (Coeff. = 2.179, p = 0.046). However, multiple methods of coordination 
are not necessarily indicative of more coordination; thus, we use an indicator variable in our main results. 

21 We used a p-value of 0.30 as a discretionary cutoff for a variable to be included in the model. Additional sensitivity tests 
using cut-off p-values of 0.2, 0.4, and 0.5 produce materially similar results. 

results if the year of IAF data collection is correlated with both MW disclosures and our measures 
of IAF quality. To test for bias, we re-estimate Equation (1) after including an indicator variable, 
SOX, to indicate whether IAF data were collected on or after July 2002. The coefficient on SOX 
(Coeff. = -0.418) is not statistically significant (p = 0.509), and inclusion of the SOX variable 
does not materially affect our results. For firms with survey data available for both 2003 and 2004, 
we use the data from 2003 (rather than 2004) and re-estimate the models. The results are largely 
consistent with our reported results. 

We further confirm the robustness of our results by using additional control variables for 
external monitoring. We include an additional corporate governance factor, Gompers et al.’s 
(2003) GINDEX. GINDEX is measured as the average G-index score for 2002 and 2004. This 
reduces the sample size to 200 firms. We obtain similar results for all IAF variables except 
IAGRADE and FOLLOWUP. In the full model, the coefficients for IAGRADE and FOLLOWUP 
are insignificant. We also control for the possibility that a firm audited by one of the Big 4 audit 
firms may be more likely to disclose a material weakness. Our results remain unchanged from 
those reported in Table 6. 


VI. CONCLUSION 

This study investigates associations between material weakness disclosures and various IAF 
attributes and activities using survey data collected by the IIA. Our results indicate that the nature 
and scope of IAF activities are more strongly associated with MW disclosures than the IAF 
attributes of competence, objectivity, and investment. Among IAF attribute measures, only the 
education level of the IAF is significantly associated with MW disclosures. Regarding IAF activities, 
we find that MW disclosures are negatively associated with the extent to which the IAF uses 
QA techniques in fieldwork, audits activities related to financial reporting, and follows up on 
previously identified control problems. The year-end timing of most Section 404 work and the 
nature of follow-up procedures suggest that these activities are more likely to be preventative 
than detective. We also find that MW disclosures are positively related to both IAF 
grading of audit engagements and external-internal auditor coordination. We interpret this finding 
to indicate that these activities increase the effectiveness of Section 404 compliance processes by 
facilitating risk assessment, consistent with the risk-based approach promoted by regulatory guid- 
ance (SEC 2007; PCAOB 2005, 2007a). Together, our results have important implications for 
managers who determine IAF staffing and activities, standard-setters who provide auditing guid- 
ance, and external auditors responsible for Section 404 work. 
This study makes several important contributions to the literature. First, we expand extant 
research on both internal auditing and ICFR by using survey data from companies to document 
associations between various measures of IAF quality and material weakness disclosures. Re- 
search on the determinants of internal control deficiencies has investigated firm characteristics 
such as size, profitability, and complexity, without consideration of the role of the IAF (Ashbaugh-Skaife 
et al. 2007; Doyle et al. 2007a). With the exception of Prawitt et al. (2009), prior IAF 
studies focus on how measures of IAF quality affect the quality of internal auditors’ decisions, 
without linking IAF quality to an actual outcome measure of control effectiveness, such as a 
material weakness (Berry et al. 1987; Harrell et al. 1989; Church and Schneider 1995). Accordingly, 
this study complements Prawitt et al. (2009), which finds a positive relation between a 
comprehensive measure of IAF quality and earnings quality. In combination, these studies suggest 
that the IAF is an important component of the financial reporting process. Second, the bi- 
directional relations between IAF quality measures and MW disclosures provide evidence on the 
roles that specific IAF attributes and activities play in the existence and detection of MWs. Few 
archival studies directly link auditor practices to the prevention or detection of audit exceptions. 
Finally, we provide evidence that external auditors are more likely to detect material weaknesses 
when they coordinate their efforts with the IAF. While a significant branch of the internal auditing 
literature focuses on the relation between external auditors and the IAF, no study of which we are 
aware has examined whether IAF involvement in external audits increases the effectiveness of 
external audits. 

Our study is subject to several limitations. Most notably, the small size and homogeneity of 
our sample, combined with the large number of control variables, lowers the power of our statis- 
tical tests. Accordingly, we cannot determine whether the lack of statistically significant hypoth- 
esized relations between IAF attributes is due to low statistical power or competing effects on the 
existence and detection of control problems. The stepwise tests mitigate this problem but do not 
eliminate it. Furthermore, large firms with relatively sophisticated IAFs tend to participate in the 
GAIN survey. This limits our ability to generalize findings to firms that did not respond to the 
GAIN survey. Despite these limitations, this study increases our understanding of the IAF's role in 
ICFR. 

ABSTRACT: This study examines the causes and consequences of internal control 
deficiencies in the nonprofit sector using a sample of 27,495 public charities from 1999 
to 2007. We first document that the likelihood of reporting an internal control problem 
increases for nonprofit organizations that are in poor financial health, growing, more 
complex, and/or smaller. We then present evidence that the disclosure of weak internal 
controls over financial reporting is negatively associated with subsequent donor support 
received after controlling for the current level of donor support and other factors influ- 
encing donations. We likewise report a negative association between internal control 
problems and subsequent government grants. Our results suggest that donors and 
government agencies, important sources of capital for nonprofit organizations, react 
either directly or indirectly to internal control information. 


Keywords: internal control; nonprofit organizations; donors; government grants; 
Sarbanes-Oxley. 

I. INTRODUCTION 
The nonprofit sector represents a sizable slice of the United States economy. Nonprofit 
organizations had over $3.4 trillion in assets under their control and charitable giving to 
these organizations reached an estimated $295 billion, or 2.2 percent of gross domestic 
product, in 2006 (Wing et al. 2008). Several recent financial scandals have highlighted the significant 
fiduciary responsibilities of nonprofit managers as well as the relatively weak regulatory 
oversight of the nonprofit sector.1 As a result, lawmakers have increased calls for nonprofit 
organizations to adopt more rigorous corporate governance practices, including improved internal 
control practices. 

Internal control audits are not new to the nonprofit sector. Nonprofit organizations that receive 
federal funding have been subject to reviews of internal control since 1990. We make use of this 
unique setting to investigate the causes of internal control deficiencies and, perhaps more interestingly, 
the consequences of internal control reporting for these organizations. Specifically, we 
examine the characteristics of public charities that report internal control problems and the effect 
of such problems on subsequent contributions and government grants received. 

Internal control is broadly defined as the process put in place by management to provide 
reasonable assurance regarding the achievement of effective and efficient operations, reliable 
financial reporting, and compliance with laws and regulations. Thus, results of internal control 
audits provide information on the level of risk that a nonprofit organization is not effectively 
carrying out its mission-related activities and fiduciary responsibilities. For this study, we define 
an internal control problem as the existence of a reportable condition over financial reporting or 
over compliance with federal program requirements. 

We first model the probability of disclosing an internal control problem as a function of 
salient characteristics of nonprofit organizations using a sample of 27,495 public charities from 
1999 to 2007. Our results generally suggest that nonprofit organizations that are more complex, 
financially distressed, smaller, and/or growing rapidly are more likely to disclose an internal 
control problem, consistent with prior research (Ge and McVay 2005; Keating et al. 2005; Doyle 
et al. 2007a; Ashbaugh-Skaife et al. 2007). 

Next, we consider the consequences of disclosure of an internal control problem for nonprofit 
organizations. Previous research into the consequences of an internal control deficiency focuses 
predominately on for-profit firms’ cost of equity capital, either directly or indirectly through the 
market's response to the announcement of an internal control problem. The results from this 
research are mixed. Ashbaugh-Skaife et al. (2009) report that the disclosure of an internal control 
problem is associated with a higher cost of capital. However, using a different specification, 
Ogneva et al. (2007) find no relation between internal control deficiencies and the cost of equity 
capital. Furthermore, there is mixed evidence on the market response to internal control problems, 
with Section 302 disclosures generating negative abnormal returns but Section 404 disclosures 
having no effect (Beneish et al. 2008; Hammersley et al. 2008). 

Nonprofit organizations do not issue shares and their missions are not to maximize profit. 
While nonprofit managers are not accountable to shareholders, they are accountable to donors and 
grantors who provide an important source of capital. These donors and grantors do not have 
limitless resources and, therefore, nonprofit organizations must compete for funding. If an orga- 
nization reports an internal control problem, donors could choose to support another organization 
where, presumably, the capital is used more efficiently. Therefore, disclosure of an internal control 
deficiency could result in lower subsequent contributions. Alternatively, unlike shareholders, donors 
do not ultimately benefit from a nonprofit organization's activities and, thus, are less likely to 
monitor the organization (Fama and Jensen 1983). Some donors may be unaware of the problem 
or may not care about the problem and, therefore, the disclosure of an internal control deficiency 
may be unrelated to subsequent contributions. 





1 These scandals include the conviction of the CEO of the United Way of America for fraud; the Ponzi scheme perpetuated 
by the Baptist Foundation of Arizona, which an audit by Arthur Andersen failed to detect and which resulted in the largest 
nonprofit bankruptcy ever; the embezzlement of funds from ACORN by the founder's brother; and the lavish spending 
of university money by the president of Oral Roberts University, to name a few. 


The Accounting Review January 2011 


American Accounting Association 








The Causes and Consequences of Internal Control Problems in Nonprofit Organizations 327 


We examine whether the disclosure of an internal control problem is associated with lower 
contributions received subsequently from donors using the Weisbrod and Dominguez (1986) 
model, which captures the responsiveness of donations to various economic factors. We use a 
two-stage estimation procedure to control for endogeneity between internal control problems and 
contributions received. Our results indicate that reportable conditions over financial reporting are 
negatively associated with future public support, even after controlling for the current level of 
public support and other drivers of contributions. Organizations that disclose internal control 
problems over financial reporting receive fewer contributions from individuals, corporations, and 
foundations in the subsequent year. 

Next, we investigate the effect of disclosing internal control deficiencies on subsequent 
contributions received from local, state, and federal government agencies. Because audits are 
mandated by the federal government for recipients of federal funding and the results of these 
audits are filed with the Federal Audit Clearinghouse, governmental agencies likely use the infor- 
mation contained in the audit reports as one factor in funding decisions. Our results are consistent 
with expectations. We report negative associations between reportable conditions over both finan- 
cial reporting and federal program compliance and subsequent government contributions, after 
controlling for prior-year government contributions and political and economic determinants of 
governmental funding allocations. 

This study informs the debate over whether public charities should adopt more rigorous 
corporate governance practices, particularly in relation to internal control. Recently, policymakers 
have focused attention on the perceived lack of accountability and transparency by charitable 
organizations. This increased scrutiny is not necessarily unwarranted due to the recent financial 
scandals and the size of the nonprofit sector. Recognizing that they must maintain the public's 
trust, charities are working together to convince policymakers that they can address their shortcomings 
without onerous regulations.2 However, the nonprofit sector has not focused much 
attention on the particular issue of internal control. Opponents of increased regulation argue that most 
donors do not use detailed financial information to make giving decisions and that nonprofits do 
not have the funds to comply with burdensome rules (e.g., Irvin 2005; Mulligan 2007). Our 
evidence suggests that the internal control information currently produced by a subset of organi- 
zations in the nonprofit sector does affect, either directly or indirectly, both donors’ and govern- 
ment agencies’ funding decisions. 

The results of this study should also interest nonprofit managers who make decisions about 
how to allocate scarce resources. During difficult economic times, when the demand for services 
is skyrocketing, it is essential that nonprofit organizations continue to attract donors and grantors. 
These organizations face tremendous pressure to focus resources on mission-related activities. 
However, Hager et al. (2004) argue that pressure from donors and watchdog groups to maximize 
mission-related spending and limit overhead costs to artificially low levels is detrimental in the 
long-run. Our results are consistent with the Hager et al. (2004) argument that underinvestment in 
administrative expenses (e.g., internal controls) can ultimately have negative consequences on 
mission-related activities. In particular, our evidence suggests that improving internal controls not 
only reduces the risk of monetary loss resulting from fraud or accounting error, but may also 
increase the organization’s ability to deliver services by attracting additional funding. 

Furthermore, this study contributes to the literature on the consequences of internal control 
reporting as it provides a more direct measure of the response to internal control problems. Prior 


2 The most prominent example of self-regulation is the National Panel on the Nonprofit Sector convened by the Independent 
Sector. This panel proposed extensive changes in nonprofit governance and oversight in a June 2005 report to 
Congress, “Strengthening Transparency, Governance, and Accountability of Charitable Organizations.” 

research examines the impact on the cost of equity capital, which could be considered a less direct 
measure of stakeholder response than donor contributions and government grants, in part because 
it is inferred from market models under some potentially strong assumptions. In this study, we 
measure stakeholder response to internal control problems by investigating the change in donor 
and government support. 

Finally, understanding the effects of disclosure of internal control problems is important 
because auditors of nonprofit organizations adopted SAS No. 112, Communicating Internal Control 
Matters Identified in an Audit, in 2007 and its successor SAS No. 115 of the same title in 
2009. These standards define the types of internal control deficiencies, provide detailed guidance 
on evaluating the severity of internal control deficiencies, and require auditors to communicate in 
writing to management and those charged with governance any deficiencies noted in an audit 
(Professional Standards, AICPA 2010, vol. 1, AU §352.01). As a result, these standards may 
influence public perception of nonprofit organizations. For example, PricewaterhouseCoopers 
(2006, 2) notes, “If an auditor identifies an internal control issue, it must be reported to trustees, 
granting agencies, and other regulators under new definitions and in a more public manner than 
before and, as a result, control deficiencies could be exposed to greater scrutiny by stakeholders.” 
Thus, consequences from reporting an internal control deficiency during the sample period are 
likely to be amplified under today’s standards. 

The next section outlines current nonprofit regulatory oversight, with an emphasis on internal 
control reporting. Section III presents our hypotheses and related empirical models. Section IV 
describes the sample selection procedures and data. Section V reports our results. Section VI 
concludes. 


II. BACKGROUND 

The nonprofit sector is growing rapidly in size and complexity. Approximately 1.4 million 
nonprofit organizations operate in the United States today (Wing et al. 2008). These organizations 
vary significantly in terms of mission, size, and primary revenue source. The Internal Revenue 
Code defines over 25 categories of nonprofits, such as human service organizations, schools, 
health care providers, cultural institutions, community development corporations, affordable hous- 
ing, and research laboratories. Nonprofits exist to provide a public benefit, and, therefore, receive 
preferential tax treatment and other regulatory privileges. Most nonprofit organizations are either 
public charities or private charitable foundations organized under Section 501(c)(3) of the Internal 
Revenue Code.3 Brown (2007) reports that, since 1997, the IRS has added to its master-file on 
average 39,465 exempt organizations per year, or 108 exempt organizations per day. 

Regulatory oversight has not kept pace with the growth in the number of nonprofit organizations. 
Currently there are two regulatory mechanisms by which most nonprofit organizations are 
monitored: (1) the IRS via the organization's tax return (Form 990), which is required for all 
organizations receiving at least $25,000 of public support, and (2) the nonprofit laws in the state 
of incorporation, which vary widely from state to state. These mechanisms have been criticized as 
insufficient to ensure that nonprofits meet their fiduciary obligations (Hansmann 1981; Atkinson 
1998; Fishman 2003; Reiser 2005). In fact, the IRS acknowledges a lack of enforcement presence 
(Brown 2007). Other monitoring mechanisms do exist but vary by the type of organization and 





3 Private foundations generally receive funding from a single source (i.e., a family or corporation), earn significant 
investment income, and make grants to other organizations. Public charities, as defined in Section 509(a), receive 
substantial support from the general public or government and actively conduct charitable operations. Private foundations 
are subject to various excise taxes and restrictions in order to ensure that they are using their resources for 
charitable purposes. Congress did not impose the same excise taxes and restrictions on public charities, presumably 
because donors hold public charities accountable. 

type of funding sources (e.g., watchdog groups like the Better Business Bureau, program service 
contracts like Medicare, periodic program evaluations required by foundation and corporate do- 
nors, and state telemarketing reporting). 

Congress passed the Sarbanes-Oxley Act of 2002 in an attempt to improve the accountability 
and oversight of public companies. Most of the provisions of the Sarbanes-Oxley Act do not apply 
to nonprofit organizations and no federal equivalent of the Act currently exists for nonprofits.4 
Nevertheless, Sarbanes-Oxley influences attitudes about corporate governance in the nonprofit 
community (Ostrower 2007; Iyer and Watkins 2008). Policymakers at both the state and federal 
levels are considering various proposals aimed at enhancing nonprofit accountability (Fremont-Smith 
2007). For example, Senator Charles Grassley (2006, 26), then Chairman of the Senate 
Finance Committee, said: 

Just as Congress has acted in the public interest to protect shareholders and workers from corporate 
mismanagement, so too must Congress demand transparency, accountability, and good governance 
from the nonprofit sector... Tightening rules and regulations governing the nonprofit sector will 
help repair the breach of trust that threatens to tarnish even the most reputable charities in America. 


One of the main elements of Sarbanes-Oxley is management's responsibility for internal 
controls. Section 302 of the Act requires that chief executive and chief financial officers evaluate 
the design and effectiveness of internal controls on a quarterly basis and report an overall conclu- 
sion about the effectiveness of internal controls. Section 404 of the Act requires an annual audit of 
management's evaluation of internal controls and of the effectiveness of internal controls. Even 
though public charities are not subject to either Section 302 or Section 404 of Sarbanes-Oxley, 
similar requirements have been considered for the nonprofit sector. For example, the attorneys 
general in the states of New York and Massachusetts have proposed bills with provisions similar 
to the requirements in Section 302. 

Some charities already are required to undergo internal control evaluations annually because 
they receive federal funding. Specifically, all organizations with federal expenditures greater than 
$500,000 ($300,000 for fiscal years ending before January 1, 2004) must have an audit conducted 
in accordance with Office of Management and Budget (OMB) Circular A-133 "Audits of States, 
Local Governments and Non-Profit Organizations."5 The results of these audits (Form SF-SAC) 
must be filed within nine months of the end of the fiscal year with the Federal Audit Clearinghouse 
and are publicly available. 
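
The audit threshold described in this paragraph can be expressed as a short rule. The following is a minimal sketch under the thresholds stated above; the function and constant names are hypothetical and are not drawn from OMB Circular A-133 itself.

```python
from datetime import date

# Per the text: $500,000 applies to fiscal years ending on or after
# January 1, 2004; $300,000 applies to fiscal years ending before that date.
THRESHOLD_CHANGE_DATE = date(2004, 1, 1)
OLD_THRESHOLD = 300_000
NEW_THRESHOLD = 500_000

def a133_audit_required(federal_expenditures, fiscal_year_end):
    """Return True if federal expenditures exceed the threshold in force
    for the given fiscal year-end (strictly greater than, as in the text)."""
    if fiscal_year_end < THRESHOLD_CHANGE_DATE:
        threshold = OLD_THRESHOLD
    else:
        threshold = NEW_THRESHOLD
    return federal_expenditures > threshold

# An organization spending $400,000 of federal funds is covered under the
# earlier threshold but not under the later one.
print(a133_audit_required(400_000, date(2003, 6, 30)))
print(a133_audit_required(400_000, date(2004, 6, 30)))
```

Because the threshold changes within the 1999 to 2007 sample window, the same organization can enter or leave the audited population without any change in its federal spending.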

The objective of an A-133 audit, also called a single audit, is to provide assurance that an 
organization receiving grants from the federal government is using the funds appropriately and is 
complying with all federal regulations (AICPA 2009). As part of an A-133 audit, auditors issue 
opinions on both the financial statements and on compliance with the provisions of the federal 
contracts or grants. In addition, auditors report on internal control over both financial reporting and 
federal program compliance. The reports on internal control identify whether there are any reportable 
conditions and, if so, whether any reportable conditions are material weaknesses.7 


4 The two provisions of SOX that do explicitly apply to nonprofit organizations are whistle blower protection and 
document destruction policies. 

5 OMB Circular A-133 was issued in 1990 under the name "Audits of Institutions of Higher Education and Other 
Non-Profit Institutions" and revised in 1996, 2003, and 2007. Our sample includes observations before and after the 
2003 revision. The 2003 revision includes raising the audit threshold to $500,000 and technical changes related to the 
determination of cognizant agency (Federal Register 68 (June 27, 2003): 38401-38402). We include year controls in our 
empirical tests to control for any differences across time. 

6 Technically, auditors must test compliance requirements for "major programs." The determination of major programs 
takes into account risk, size, and oversight by a federal agency. See AICPA (2003) for more details. 

7 Reportable conditions involve deficiencies in the design or operation of internal controls that could adversely affect the 
organization's financial reporting or its ability to administer its federal programs. Material weaknesses are reportable 

Circular A-133 establishes certain conditions for determining whether a nonprofit organization 
qualifies as a low-risk auditee. This determination is based on numerous factors including 
prior-year audit results, third-party references, the level of oversight of the granting federal 
agency, and the inherent risk of the federal programs involved. To be considered low-risk, an 
organization must have been audited annually for the past two years and these prior audits must 
have resulted in clean opinions, no internal control deficiencies, and no audit findings. The risk
determination affects the amount of auditing that is required to be performed under OMB A-133.
For organizations not deemed low-risk, auditors are required to perform more testing and, thus, are
more likely to uncover internal control problems. 
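
The two-year audit history conditions described above can be expressed as a simple check. The following Python sketch is illustrative only: it encodes just the conditions stated in this paragraph, the record layout and the names `AuditResult` and `is_low_risk` are hypothetical, and the other factors that Circular A-133 considers (such as agency oversight and program risk) are omitted.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AuditResult:
    """Outcome of one annual audit (hypothetical record layout)."""
    clean_opinion: bool
    internal_control_deficiency: bool
    audit_findings: bool


def is_low_risk(audits_by_year: Dict[int, AuditResult], current_year: int) -> bool:
    """Check only the two-year history conditions described in the text."""
    for year in (current_year - 1, current_year - 2):
        audit = audits_by_year.get(year)
        if audit is None:
            # No audit in one of the past two years: not low-risk.
            return False
        if (not audit.clean_opinion
                or audit.internal_control_deficiency
                or audit.audit_findings):
            return False
    return True
```

An organization with a missing audit year, or with any deficiency or finding in either of the two prior audits, fails the check.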

The assessment of internal controls required by Circular A-133 is not identical to the assessment
of internal controls required by Sarbanes-Oxley. The A-133 audit is not overseen by the
PCAOB, but rather is performed in accordance with Government Auditing Standards issued by the
GAO. As noted above, the scope of an A-133 audit goes beyond financial statements to include the
requirements of federal grants or contracts. Although the scope is wider, A-133 audits are gener- 
ally less stringent and less costly. Despite the differences from public company audits, A-133 
audits do provide information on the level of risk that a nonprofit organization is not effectively 
carrying out its mission-related activities and fiduciary responsibilities. 





III. HYPOTHESIS DEVELOPMENT AND EMPIRICAL MODELS
Determinants of Internal Control Deficiencies 


We first examine the determinants of internal control deficiencies. A significant body of
research examines the characteristics of publicly traded companies disclosing internal control
problems. Ge and McVay (2005) find that firms disclosing material weaknesses are more complex,
smaller, and less profitable than firms not disclosing material weaknesses. Doyle et al. (2007a) add
that firms disclosing material weaknesses are younger, growing rapidly, or undergoing restructuring.
Likewise, Ashbaugh-Skaife et al. (2007) find that firms reporting internal control deficiencies
have more complex operations, greater exposure to accounting risk, fewer resources to invest in 
internal control, and a higher likelihood of using a dominant auditor. 

Despite the extensive academic literature on internal controls in publicly traded companies, 
there is little research on internal control in the nonprofit sector. Keating et al. (2005) examine
A-133 audit results from 1997 to 1999 using univariate tests. They find that smaller organizations
and organizations not classified as low-risk (i.e., new grantees, organizations with prior problems)
disclose more internal control problems. They also report that organizations with audits performed
by national, large regional, and specialist firms report fewer internal control problems, which
differs from the Ashbaugh-Skaife et al. (2007) auditor quality results for public companies. Keating
et al. (2005) suggest that small nonprofit organizations, which are more likely to have internal
control problems, select small audit firms.

We extend Keating et al. (2005) by examining a more comprehensive set of factors that may
be associated with reporting internal control deficiencies in nonprofit organizations. Specifically,
we model the likelihood of reporting internal control problems as a function of several internal
control risk factors and audit detection variables. As discussed below, we expect organizations that
are more complex, in poor financial health, smaller, new to federal funding, and/or growing
rapidly to disclose more internal control deficiencies. 








conditions in which the design or operation of internal controls does not reduce to a relatively low level the risk of 
material noncompliance with applicable grant requirements or with GAAP caused by error or fraud that may occur and
not be detected in a timely manner (AICPA 2003, 104). OMB revised Circular A-133 in June 2007 to be consistent with
SAS No. 112, which replaced the "reportable condition" concept with "significant deficiency." Because our sample 
period pre-dates SAS No. 112, we use the term reportable condition.

Public charities with diverse operations face challenges instituting internal controls across
their various initiatives and divisions. We measure organizational complexity using the number of
revenue sources (public support, government support, and/or program service revenue).
Organizations that receive funding from only one source generally engage in fewer types of charitable
programs than organizations that receive funding from several sources. We predict that organiza- 
tions with more sources of funding (COMPLEXITY) are more likely to have a variety of opera- 
tions and, therefore, more likely to report an internal control problem. 
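
As a minimal illustration of the COMPLEXITY measure, the Python sketch below counts the revenue sources an organization reports. It assumes that a source counts when its reported amount is positive; that threshold and the function name `complexity` are assumptions made for illustration, not details stated in the text.

```python
def complexity(public_support: float,
               gov_contributions: float,
               program_revenue: float) -> int:
    """Count how many of the three revenue sources are positive (0 to 3)."""
    sources = (public_support, gov_contributions, program_revenue)
    # Each comparison is True (1) or False (0), so the sum is the count.
    return sum(amount > 0 for amount in sources)
```

For example, an organization with donations and program service revenue but no government grants would receive a COMPLEXITY value of 2 under this sketch.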

Nonprofit organizations in poor financial health are less likely to have resources to invest in 
establishing strong internal controls. We use the existence of a going-concern paragraph in the 
opinion on the financial statements (GOINGCONCERNRISK) as a proxy for poor financial health. 
A going-concern paragraph indicates that the auditor has substantial doubt whether the organiza- 
tion can meet its obligations as they become due. We expect that organizations with a going- 
concern paragraph report more internal control deficiencies. Consistent with studies of public 
companies that use the existence of losses to measure financial health, we also include an indicator 
of whether the organization's revenues exceed its expenses (SURPLUS). We expect that charities 
with a surplus have fewer internal control problems. 

Larger organizations (SIZE), as measured by total assets, have more resources and experience 
to draw on when implementing internal controls. For example, Greenlee et al. (2007) report that 
older and larger nonprofit organizations are more likely to have an internal audit function in place. 
Thus, we expect that smaller organizations disclose more internal control problems. Internal con- 
trols should change in response to organizational change as existing controls may be irrelevant or 
inefficient and new controls may be required. Rapidly growing organizations are often unable to 
adequately assess and update internal controls at the same pace at which organizational expansion 
occurs. We predict that change in size (GROWTH) is positively associated with the existence of 
internal control deficiencies. Similarly, organizations that receive federal funding for the first time 
(NEWGRANTEE) are less likely to have all of the internal controls systems in place to meet 
federal requirements. 

We also investigate the effect of auditor type on the probability of reporting an internal 
control problem but do not make a prediction. On one hand, dominant audit firms (BIG4, RE- 
GIONAL, and SPECIALIST) have more training, experience, and exposure to litigation risk, all of 
which imply that these audit firms are more likely to discover internal control deficiencies. On the 
other hand, dominant audit firms may only contract with prestigious nonprofit organizations that 
are inherently less risky. This self-selection suggests that dominant audit firms are less likely to 
discover internal control problems at their nonprofit clients. In fact, Kitching (2009) finds evidence 
that donors behave as if dominant auditors are a signal of credibility. 

As noted in Section II, auditors are required to determine whether a nonprofit organization 
qualifies as a low-risk auditee under OMB A-133. Because they are inherently less risky and 
because there is less testing involved, the likelihood of detecting an internal control problem is 
lower for low-risk auditees than it is for other auditees. Thus, we include an indicator variable if 
the organization is not deemed low-risk (RISK) as an important control in our model. 

Based on the above discussion, we estimate the probability of disclosing an internal control 
deficiency as a function of organizational characteristics and audit detection variables as follows: 


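$$
\begin{aligned}
\mathrm{Prob}(\mathit{ICD}) = {} & \beta_0 + \beta_1 \mathit{COMPLEXITY} + \beta_2 \mathit{GOINGCONCERNRISK} + \beta_3 \mathit{SURPLUS} + \beta_4 \ln \mathit{SIZE} \\
& + \beta_5 \mathit{GROWTH} + \beta_6 \mathit{RISK} + \beta_7 \mathit{NEWGRANTEE} + \beta_8 \mathit{BIG4} + \beta_9 \mathit{REGIONAL} \\
& + \beta_{10} \mathit{SPECIALIST} + \sum \gamma\, \mathit{INDUSTRY} + \sum \delta\, \mathit{YEAR}. \qquad (1)
\end{aligned}
$$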


Overall, we expect that the likelihood of reporting an internal control problem increases as a 
function of COMPLEXITY, GOINGCONCERNRISK, GROWTH, RISK, and NEWGRANTEE and
decreases as a function of SURPLUS and SIZE. The empirical specification also includes controls
for industry and year. 


Effect of Internal Control Deficiencies 


We next examine the consequences of internal control deficiencies. Prior studies of public 
companies document that internal control problems are associated with equity market concerns. 
Specifically, firms reporting an internal control deficiency under Section 302 experience stock 
price declines, with the most negative returns for material weakness disclosures (Hammersley et 
al. 2008; Beneish et al. 2008). Evidence on the impact of internal control problems on the cost of 
capital for public companies is mixed. Ashbaugh-Skaife et al. (2009) find that internal control 
problems are associated with a higher cost of equity, while Ogneva et al. (2007), using a different 
specification, do not find an association. A limitation of these studies of public companies is that 
the cost of equity is a less direct measure of stakeholders' reactions, as it is inferred from a market
model under certain strong assumptions. 

There has been little consideration given to understanding the effect of internal control deficiencies
in the nonprofit sector. Internal controls are established to provide assurance that operations
are running efficiently and that financial reporting is reliable. We expect that nonprofit
organizations with internal control problems have lower operating efficiency and produce lower 
quality financial reports, on average. Thus, internal control problems can influence directly or 
indirectly the amount of funds available to achieve the organization’s mission. The source of 
funding takes many forms depending on the type of organization, including donor contributions, 
government grants, program service revenue, and/or debt financing. As discussed below, we ex- 
amine the impact of internal control problems on donations (PUBLIC SUPPORT) and government 
grants (GOV CONTRIBUTIONS). Unlike the indirect cost of capital measure for public companies 
noted above, contributions by donors and government agencies to nonprofit organizations provide 
direct evidence of stakeholder reactions to internal control problems. 


Public Support 


Public support includes gifts received from individuals, trusts and estates, corporations, and 
foundations (DIRECT SUPPORT), as well as gifts received from federated fundraising agencies 
(INDIRECT SUPPORT), such as the United Way and the Combined Federal Campaign. Donors 
generally have less information about the quality of the nonprofit organization’s output relative to 
government grantors, customers (who provide program service revenue), and creditors. Nevertheless,
donors provide a substantial amount of support to the nonprofit sector. In the face of this
information asymmetry, it is important to understand all factors, including the quality of internal
control, that influence a donor’s charitable giving decision in a competitive market for donations. 

Several prior studies offer evidence that a public charity’s operating efficiency is positively 
associated with the amount of donor support received (e.g., Weisbrod and Dominguez 1986; 
Posnett and Sandler 1989; Greenlee and Brown 1999; Tinkelman 2004; Tinkelman and Mankaney
2007). Further, Yetman (2008) reports that donors give less to organizations that overstate 
mission-related expenses and understate fundraising expenses, providing some support for the idea 
that donors can unravel low-quality financial statements. However, Tinkelman (1998) and Khu- 
mawala et al. (2005) both provide evidence that most donors do not unravel joint cost allocations 
made to strategically overstate mission-related activities. Overall, prior research suggests that, in 


8 There are many slight variations in the definition of operating efficiency. In general, these operating efficiency measures
attempt to capture how much the nonprofit organization spends on program-related activities (i.e., fulfilling its mission)
relative to how much it spends on administrative and fundraising costs.

many but not all cases, donors use available information from the organization's Form 990 to
distinguish higher quality nonprofit organizations from lower quality nonprofit organizations and 
make their giving decisions accordingly. 

Because internal control deficiencies can signal a lack of effectiveness in providing charitable 
services and a higher probability of undetected misconduct, all else equal, we expect that nonprofit 
organizations with internal control deficiencies receive fewer subsequent contributions from the 
public than organizations with no internal control deficiencies. This hypothesis is based on the 
assumption that donors make giving decisions in order to assist in the provision of public goods 
and, thus, opt to give to organizations that can provide the public goods with minimum risks. 
While it is likely that some financially sophisticated donors (e.g., private foundations) actually 
obtain the publicly available A-133 audit report as part of the giving decision process, it is highly 
unlikely that all donors do. Even if donors do not directly learn that an organization has reported 
an internal control problem from the A-133 report, they may still indirectly receive information 
about that problem. For example, internal control problems can be associated with lower operating 
efficiency, which is observable on the more widely distributed Form 990. Alternatively, a donor 
may have lower quality interactions with an organization that has internal control problems (e.g., 
an internal control weakness causes donor acknowledgments not to be sent as required by the 
IRS). 

There are reasons why the quality of an organization's internal controls may not affect public 
support. In particular, not all donors give in order to provide a public good. Some donors simply 
seek a warm glow (Andreoni 1990; Ribar and Wilhelm 2002) and, thus, internal control informa- 
tion, or any financial information for that matter, is irrelevant. Also, it may be too costly for donors 
to obtain and evaluate A-133 audit information. Finally, if the internal control audit results are not
filed until nine months after year-end, the information may be stale. Therefore, it is an open 
empirical question as to whether internal control problems affect subsequent contributions. 

We adapt the widely used Weisbrod and Dominguez (1986) approach to capture the respon- 
siveness of donations to various economic factors. Weisbrod and Dominguez (1986) model public 
support as a function of conventional market variables, including price, fundraising expenses, and 
age: 


$$
\ln \mathit{PUBLIC\ SUPPORT}_t = \beta_0 + \beta_1 \ln \mathit{FUNDRAISING\ EXP}_{t-1} + \beta_2 \ln \mathit{PRICE}_{t-1} + \beta_3 \mathit{AGE}_t.
$$


PRICE measures the cost to the donor of "purchasing" (i.e., contributing) one more dollar of the 
organization's charitable output. PRICE depends on the after-tax cost of giving, as well as the 
efficiency by which the organization generates output. Specifically, PRICE is defined as: 


$$
\mathit{PRICE} = \frac{1 - T}{1 - \dfrac{\mathit{FUNDRAISING\ EXP} + \mathit{ADMINISTRATIVE\ EXP}}{\mathit{TOTAL\ EXPENSE}}}
$$


Donors face the same marginal tax rate with respect to donations for all charitable organizations 
and, thus, we assume T = 0. Note that the denominator is equivalent to the program expense ratio 
(Program Expenses/Total Expenses) so when T = 0, PRICE equals the inverse of the program
expense ratio (Total Expenses/Program Expenses). Theoretically, price should have a negative


This discussion suggests that any influence that internal control does have on public support is moderated by the level 
of sophistication of the organization’s donor clientele. Unfortunately, it is impossible to test this supposition using 
archival data because nonprofit organizations do not disclose the identities of their donors in a consistent, systematic 
manner. See Baber et al. (2001) for a discussion of donor clienteles. 

10 When defining PRICE, some studies scale fundraising and administrative expenses by public support (e.g., Weisbrod

influence on the level of giving. However, Bowman (2006) notes that, in prior empirical studies,
results of tests examining the effect of price on public support are sensitive to model specification. 
The Weisbrod and Dominguez (1986) model also includes FUNDRAISING EXP, which represents 
the organization's effort to reduce information asymmetry, and AGE, which represents the orga- 
nization's stock of goodwill. Both are expected to positively affect public support. 
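
The PRICE definition can be checked numerically. The Python sketch below is a minimal illustration under the stated assumption T = 0 by default; the function name `price` and the example amounts are hypothetical. It shows that, with T = 0, PRICE equals Total Expenses divided by Program Expenses.

```python
def price(fundraising_exp: float,
          administrative_exp: float,
          total_expense: float,
          tax_rate: float = 0.0) -> float:
    """PRICE = (1 - T) / (1 - (fundraising + administrative) / total expense)."""
    overhead_share = (fundraising_exp + administrative_exp) / total_expense
    if overhead_share >= 1.0:
        # No program spending: the price of charitable output is undefined.
        raise ValueError("program expense ratio must be positive")
    return (1.0 - tax_rate) / (1.0 - overhead_share)


# Hypothetical organization: 10 fundraising, 15 administrative, 75 program.
example = price(fundraising_exp=10.0, administrative_exp=15.0, total_expense=100.0)
inverse_program_ratio = 100.0 / 75.0
```

In this example a donor must give about 1.33 dollars to fund one dollar of program output, which matches the inverse of the 75 percent program expense ratio.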

In order to test our expectation about the effect of internal control problems on donor contri- 
butions, we estimate the following equation: 


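$$
\begin{aligned}
\ln \mathit{PUBLIC\ SUPPORT}_t = {} & \beta_0 + \beta_1 \mathit{INTERNAL\ CONTROL\ DEFICIENCY}_{t-1} \\
& + \beta_2 \ln \mathit{FUNDRAISING\ EXP}_{t-1} + \beta_3 \ln \mathit{PRICE}_{t-1} + \beta_4 \mathit{AGE}_t \\
& + \beta_5 \ln \mathit{GOV\ CONTRIBUTIONS}_{t-1} + \beta_6 \ln \mathit{PROGRAM\ REVENUE}_{t-1} \\
& + \beta_7 \ln \mathit{PUBLIC\ SUPPORT}_{t-1} + \sum \gamma\, \mathit{INDUSTRY} + \sum \delta\, \mathit{YEAR}. \qquad (2)
\end{aligned}
$$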


In addition to the Weisbrod and Dominguez (1986) variables, we include government grants (GOV
CONTRIBUTIONS) and program service revenue (PROGRAM REVENUE) in order to control for
any crowding-out or crowding-in effects. Khanna and Sandler (2000) and Okten and Weisbrod
(2000) provide evidence of a positive relation between public support and government grants and
program service revenue, indicating a crowding-in effect. Finally, we include prior-year public
support to capture any other organization-specific factors, as well as industry and year controls.
We are primarily interested in the coefficient on internal control deficiency, $\beta_1$, and predict that the
existence of an internal control problem is negatively associated with subsequent public support.
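
Because the dependent variable in Equation (2) is in logs and INTERNAL CONTROL DEFICIENCY is an indicator, its coefficient implies a proportional change in public support of exp(coefficient) - 1. The Python sketch below illustrates this standard conversion; the coefficient value used is hypothetical and is not an estimate from this study.

```python
import math


def implied_percent_change(coefficient: float) -> float:
    """Percent change in the level of a logged dependent variable
    when a 0/1 regressor switches from 0 to 1."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Hypothetical coefficient, chosen only to illustrate the conversion.
hypothetical_coefficient = -0.05
effect = implied_percent_change(hypothetical_coefficient)
```

For small coefficients the result is close to 100 times the coefficient, so the approximation and the exact conversion differ little.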


Government Contributions 

Government contributions represent gifts and grants from federal, state, and local
governments. Similar to our expectation for public support, we hypothesize that nonprofit
organizations with internal control deficiencies receive fewer government contributions than
organizations with no internal control deficiencies. Given that the federal government mandates internal
control reporting as part of the required A-133 audit, all else equal, the federal government should 
use this internal control information to make funding decisions. 

Even though our prediction of a negative association between internal control problems and 
subsequent government contributions seems intuitive, there are questions about the actual deter- 





and Dominguez 1986; Okten and Weisbrod 2000; Bowman 2006), while others scale by total expenses (e.g., Posnett and 
Sandler 1989; Tinkelman 1998; Khanna and Sandler 2000). That is, some studies measure the amount of charitable 
output relative to revenue received and some measure charitable output relative to total expenses. Our results are not 
affected by the choice of scale. We choose to present the results scaling by total expenses for a practical reason. In many
instances, the sum of fundraising and administrative expenses exceeds public support. In these instances, log(PRICE) is 
undefined. Thus, scaling by public support limits the number of usable observations. 
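
The practical problem with scaling by public support can be illustrated with a short Python sketch. It assumes that the alternative definition replaces TOTAL EXPENSE with public support in the PRICE denominator and sets T = 0; that form, the function name `ln_price_scaled_by_support`, and the example amounts are assumptions for illustration.

```python
import math
from typing import Optional


def ln_price_scaled_by_support(fundraising_exp: float,
                               administrative_exp: float,
                               public_support: float) -> Optional[float]:
    """ln(PRICE) with overhead scaled by public support; None when undefined."""
    if public_support <= 0:
        return None
    denominator = 1.0 - (fundraising_exp + administrative_exp) / public_support
    if denominator <= 0:
        # Overhead meets or exceeds public support, so PRICE is not positive
        # and its logarithm does not exist.
        return None
    return math.log(1.0 / denominator)
```

Observations for which the function returns `None` would be dropped under this scaling, which is the loss of usable observations that the note describes.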
Crowding-out occurs when an increase in government support discourages donors from increasing their own
contributions because need is already being met by government support. Crowding-in occurs when government support
encourages donors to increase their own giving, often because government grants enhance the reputation of the nonprofit
organization and because government grants are accompanied by more monitoring that provides additional assurance to
donors that their funds are being used appropriately.

We examine the association between internal control problems and public support received in the following year. To the
extent that information from internal control audits is not available until nine months after the fiscal year-end and donors
directly use this information, our tests are biased against finding results. Our approach is consistent with prior studies
that examine the influence of program-spending ratios on subsequent years' giving using 990 data, where 990s are
generally filed from 5 to 11 months after year-end.

13 It is important to note that government contributions on IRS Form 990 are distinct from government contracts or
payments for service, which are included in program service revenue (e.g., Medicare payments received by a hospital
are not classified as government contributions). Thus, it is common for organizations to receive no government
contributions but still qualify for the A-133 audit because they earn revenue from federal contracts.

minants of government funding in the nonprofit sector. It is possible that political factors, and not
the quality of the nonprofit organization as signaled by the internal control audit results, drive 
government contribution decisions. For example, the New York Attorney General recently initiated 
a probe into “pay-to-play” campaign donations made by nonprofit organizations to politicians in 
order for the nonprofit organizations to receive government grants (Dicker and Goldenberg 2009). 
In addition, government contributions include funds received from state and local governments, 
which may be less likely to use the A-133 audit report than the federal government. Finally, the
federal government comprises a wide variety of federal agencies. There are 55 different federal 
agencies with oversight responsibilities in our sample, ranging from the CIA to the Peace Corps 
and from the National Science Foundation to the National Endowment for the Arts. These 
agencies have different missions and likely use internal control information differently. For these 
reasons, we empirically examine the link between internal control weaknesses and subsequent 
government grants. 

We estimate a model of government contributions as a function of internal control problems 
and political and socio-economic factors, as follows: 


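$$
\begin{aligned}
\ln \mathit{GOV\ CONTRIBUTIONS}_t = {} & \beta_0 + \beta_1 \mathit{INTERNAL\ CONTROL\ DEFICIENCY}_{t-1} \\
& + \beta_2 \ln \mathit{FUNDRAISING\ EXP}_{t-1} + \beta_3 \mathit{LOBBYING}_t + \beta_4 \mathit{AGE}_t \\
& + \beta_5 \ln \mathit{PRICE}_{t-1} + \beta_6 \ln \mathit{PUBLIC\ SUPPORT}_{t-1} \\
& + \beta_7 \ln \mathit{PROGRAM\ REVENUE}_{t-1} \\
& + \beta_8 \ln \mathit{GOV\ CONTRIBUTIONS}_{t-1} + \beta_9 \mathit{GDP}_t + \sum \alpha\, \mathit{STATE} \\
& + \sum \gamma\, \mathit{INDUSTRY}. \qquad (3)
\end{aligned}
$$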


In order to address the political determinants of government funding, we include LOBBYING, 
which is an indicator variable that designates whether the organization incurred expenditures to 
influence legislation through communication with legislators or government officials. While 
lobbying may not, per se, result in more government funding, engaging in lobbying activities
signifies a politically savvy nonprofit organization. Thus, we expect a positive association between 
lobbying and government contributions. 

We include annual gross domestic product (GDP) to control for economic conditions. Governments
should have a greater supply of funds available for gifts and grants when GDP is higher.
Alternatively, the demand for government grants is higher during periods of low GDP. Thus, we 
make no predictions about the coefficient on GDP. 

We also use state indicator variables as proxies for demand for government funding. One 
objective of government contributions is to redistribute revenue to geographic areas with eco- 


14 Different states and municipalities have different levels of monitoring of their grant programs. It is not possible to 
identify the specific source(s) of government contributions from the Form 990. The largest portion of government 
contributions for most nonprofits comes from the federal government, although large, complex organizations receive 
contributions from numerous federal, state, and local agencies. 

Nonprofit organizations expending more than $50 million ($25 million for fiscal years ending before January 1, 2004)
in federal awards are assigned a cognizant agency. All other nonprofits are assigned an oversight agency. Generally, the
cognizant or oversight agency is the federal agency that provides the predominant amount of federal funding. The
purpose of the cognizant or oversight agency is to provide technical audit advice and act as a liaison between the 
nonprofit organization and other federal agencies with respect to audit issues. 

16 Nonprofit organizations are not allowed to make political campaign donations but are allowed to engage in some 
lobbying activities, subject to certain limitations, without risking their tax-exempt status. See Treasury Regulations 
Section 56.4911 for more details. 

17 Because we measure GDP on an annual basis, we cannot concurrently include year controls in the model.

nomic need. In some cases, the level of funding is established mathematically by the population
served by the nonprofit organization (e.g., need-based formula grants). While nonprofit organiza- 
tions do not operate exclusively in one geographic region, the state in which these organizations 
are headquartered can generally reflect characteristics of the populations that they serve (i.e., level 
of poverty). State indicator variables may also control for political factors (e.g., state representa- 
tion on Congressional appropriation committees). 

Andreoni and Payne (2003) find that nonprofit organizations reduce fundraising efforts when 
they receive government grants. If a nonprofit organization determines that government support 
will be slashed in the next period, either because the organization has fallen out of political favor 
or because of macroeconomic constraints, the nonprofit organization may increase fundraising.
Thus, we expect a negative coefficient on FUNDRAISING EXP. 

Generally, individual donors use PRICE as a measure of operating efficiency, while govern- 
ments have more direct methods of monitoring an organization's efficiency (Khanna and Sandler 
2000). Nevertheless, some government agencies, particularly at the state and local level, could use 
basic operating efficiency ratios to make decisions and/or PRICE could serve as a proxy for the 
more complex efficiency measures actually used by government agencies. Thus, we include 
PRICE, as defined in the previous section, in Equation (3). We also include AGE as an indicator
of reputation, PUBLIC SUPPORT and PROGRAM REVENUE as controls for any crowding-in or
crowding-out effects, prior-year GOV CONTRIBUTIONS to capture any other
organization-specific factors, and industry controls. As with Equation (2), we are primarily interested in the
coefficient on internal control deficiency, $\beta_1$, and expect that the disclosure of an internal control
problem is negatively associated with subsequent government contributions. 

A final note: a negative $\beta_1$ in either Equation (2) or Equation (3) may
indicate that disclosure of internal control deficiencies influences giving decisions. However, it is 
also possible that low levels of contributions result in inadequate resources necessary for a non- 
profit organization to implement strong controls. To address endogeneity concerns, we implement 
the Heckman (1979) selection model. In the first stage, we use Equation (1) to estimate the 
likelihood of reporting an internal control deficiency and, using the parameters of this model, 
compute an inverse Mills ratio. In the second stage, we estimate Equation (2) and Equation (3) 
with the inverse Mills ratio as a control. 
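
The first-stage quantity can be sketched as follows. This Python example assumes a probit first stage and uses one common sign convention for a binary regressor: the ratio of the standard normal density to the distribution function for organizations that report a deficiency, and the negative of the density over one minus the distribution function otherwise. The text does not state which convention is applied, so both the convention and the function names are assumptions; `index` stands for the fitted linear index from Equation (1).

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index: float, reported_deficiency: bool) -> float:
    """Inverse Mills ratio from a first-stage probit index (assumed convention)."""
    if reported_deficiency:
        return normal_pdf(index) / normal_cdf(index)
    return -normal_pdf(index) / (1.0 - normal_cdf(index))
```

The resulting value would enter Equation (2) or Equation (3) as one additional regressor for each observation.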


IV. SAMPLE SELECTION AND DATA 

We obtain data on public charities from two sources: (1) the A-133 Single Audit database
available from the Federal Audit Clearinghouse and (2) the IRS Form 990 databases available
from the National Center for Charitable Statistics (NCCS). The A-133 data include general auditee
information, the amount of federal awards expended, auditor name, type of audit performed, audit
opinions, internal control information, and audit findings as reported on the Form SF-SAC. The
IRS data include revenues, expenses, and balance sheet data as reported on the Form 990. All
variables that we use from these databases are defined in Table 1.





18 Information on internal controls over major programs is not available in the electronic Single Audit database until 2001 
when the federal government changed the format of Form SF-SAC. Information on internal controls over financial
reporting is available for all years. 
We use data from several different Form 990 databases. The Core Trend v2009a file provides organizational
characteristics and basic financial statement data from 1998-2007. The DD Revenues and Expenses v2005 file provides a
detailed breakdown of revenues and expense categories from 1998-2003. The SOI file provides a detailed breakdown of
revenue and expense categories from 2004-2007 for a stratified random sample of firms selected by the IRS. The Core
Supplemental v2009 file provides a detailed breakdown of revenues and expenses categories for 2004-2006 for
organizations not covered in the SOI file. Note our sample does not include detailed data for the universe of organizations
for 2007 but rather only for the sample in the SOI file. Our results are consistent when we exclude 2007 from our 
analysis. 
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Variable 
RC. FS 


RC GOV 


RC ANY 
PUBLIC SUPPORT 


DIRECT SUPPORT 
INDIRECT SUPPORT 


PROGRAM REVENUE 
GOV CONTRIBUTIONS 
FUNDRAISING EXP 
COMPLEXITY 
GOINGCONCERNRISK 
SURPLUS 
NEWGRANTEE 

AGE 

SIZE 

GROWTH 

RISK 


The Accounting Review

Variable Definitions

RC_FS = An indicator variable that equals 1 if the A-133 audit noted reportable
conditions in internal controls over financial reporting (SF-SAC Part
II 23); otherwise 0.

RC_GOV = An indicator variable that equals 1 if the A-133 audit noted reportable
conditions in internal controls over major programs (SF-SAC Part III
155); otherwise 0.

RC_ANY = An indicator variable that equals 1 if RC_FS equals 1 or RC_GOV
equals 1; otherwise 0.

PUBLIC SUPPORT = Total public support received for the fiscal year, defined as the sum of
direct support and indirect support.

DIRECT SUPPORT = Direct public support received for the fiscal year (Form 990 Line 1a).

INDIRECT SUPPORT = Indirect public support received for the fiscal year (Form 990 Line
1

PROGRAM REVENUE = Program service revenue, including government fees and contracts,
received for the fiscal year (Form 990 Line 2).

GOV CONTRIBUTIONS = Government contributions (grants) received for the fiscal year (Form

FUNDRAISING EXP = Fundraising expenses for the fiscal year (Form 990 Line 15).

COMPLEXITY = Number of revenue sources included on Form 990 from 0-3
(PUBLIC SUPPORT, GOV CONTRIBUTIONS, and/or PROGRAM REVENUE).

GOINGCONCERNRISK = An indicator variable that equals 1 if the A-133 audit includes a
going-concern explanation (SF-SAC Part II 42); otherwise 0.

SURPLUS = An indicator variable that equals 1 if total revenues (Form 990 Line
12) - total expenses (Form 990 Line 17) > 0; otherwise 0.

NEWGRANTEE = An indicator variable that equals 1 if the current year is the first year
an organization expends federal contributions; otherwise 0.

AGE = Number of years the organization has been tax-exempt (IRS

SIZE = Beginning-of-year total assets (Form 990 Line 59a).

GROWTH = The growth in assets, measured as the ratio of end-of-year total
assets (Form 990 Line 59b) to beginning-of-year total assets (Form
990 Line 59a).

RISK = An indicator variable that equals 1 if organization is classified as "not
low risk" on the A-133 audit (SF-SAC Part III 43); otherwise 0.

BIG4 = An indicator variable that equals 1 if auditor (SF-SAC Part I #7) of
the A-133 report is classified as one of the Big 6 auditors; otherwise 0.

REGIONAL = An indicator variable that equals 1 if auditor (SF-SAC Part I #7) of
the A-133 report is classified as one of the Regional auditors; otherwise 0.

(continued on next page) 


January 2011 


American Accounting Association 


338 Petrovits, Shakespeare, and Shih


TABLE 1 (continued) 


Variable Definition

SPECIALIST = An indicator variable that equals 1 if auditor (SF-SAC Part I #7) of
the A-133 report is classified as one of the Specialist auditors;
otherwise 0.

PRICE = Total Expenses/Program Service Expense, where Total Expenses is
the sum of fundraising expenses (Form 990 Line 15), management
and general expenses (Form 990 Line 14), and program services
expenses (Form 990 Line 13).

LOBBYING = An indicator variable that equals 1 if organization reports lobbying
expenditures to directly influence a legislative body (Form 990
Schedule A Line 37b); otherwise 0.


Source: IRS Form 990 from the National Center for Charitable Statistics and Form SF-SAC from the Federal Audit 
Clearinghouse. 


Table 2, Panel A details the sample selection process. A merge of the A-133 data and the IRS
Core Trend data on EIN and year results in a sample of 127,988 observations (27,495 public
charities) from 1999 to 2007.²⁰ We use the Full Sample to shed light on the determinants of
internal control weaknesses for a broad cross-section of organizations.

As discussed in the previous section, the second stage of our tests focuses on the influence of 
internal control problems on subsequent public support and government contributions. Some pub- 
lic charities receive only an immaterial amount of public support and/or government contributions 
(e.g., low-income housing projects). Thus, for the second stage, we further limit our sample to the 
subsets of organizations where public support or government contributions represent a nontrivial 
source of revenue. Specifically, following Tinkelman and Mankaney (2007), we eliminate observations
with public support (government contributions) of less than $100,000. This process results
in a Public Support Sample of 47,318 observations (12,342 public charities) and a Government 
Sample of 65,415 observations (16,369 public charities). These limited samples are still signifi- 
cantly larger than samples in previous studies of internal control in the for-profit literature. Lack of 
necessary data further reduces sample size for specific tests. 
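
The subsample screen described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`OrgYear`), the field names, and the three example rows are hypothetical, and the only value taken from the text is the $100,000 cutoff.

```python
from dataclasses import dataclass
from typing import List, Optional

THRESHOLD = 100_000  # dollar cutoff stated in the text


@dataclass
class OrgYear:
    """Hypothetical organization-year record; field names are illustrative."""
    ein: str
    year: int
    public_support: Optional[float]      # None means the amount is missing
    gov_contributions: Optional[float]   # None means the amount is missing


def subsample(rows: List[OrgYear], field: str,
              threshold: float = THRESHOLD) -> List[OrgYear]:
    """Keep observations whose `field` is present and not below `threshold`."""
    kept = []
    for row in rows:
        value = getattr(row, field)
        if value is not None and value >= threshold:
            kept.append(row)
    return kept


# Made-up observations, not data from the study.
full_sample = [
    OrgYear("A", 2003, 250_000.0, 50_000.0),
    OrgYear("B", 2003, 99_999.0, 400_000.0),
    OrgYear("C", 2004, None, 120_000.0),
]

public_support_sample = subsample(full_sample, "public_support")
government_sample = subsample(full_sample, "gov_contributions")
print(len(public_support_sample), len(government_sample))
```

An observation can fall into both restricted samples, neither, or only one, which is why the Public Support Sample and the Government Sample are not subsets of each other.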

In Table 2, Panel B, we classify observations into five main industries based on the National 
Taxonomy of Exempt Entities (NTEE) developed by the IRS. The five industries, which are the 
same industries used in Keating et al. (2005), include: Arts, Education, Health, Human Services, 
and Public Benefit. The remaining NTEE categories (i.e., Religion, International, Environment,
and Unknown) are classified as “Other.” Human Services organizations (e.g., Red Cross chapters,
YMCAs) comprise approximately half of the sample, while Arts and Cultural organizations comprise
the smallest fraction of the sample. Untabulated results indicate that, although Human Services
are the most common type of nonprofit organizations receiving federal awards, these organizations
are also the smallest as measured by total assets. Educational institutions, which





20 Because the NCCS data contain some data errors we conduct the error-checking procedures recommended by the 
NCCS. We noted some organizations that had identical information in consecutive years. We could not universally 
determine which year contained the correct information and, thus, deleted all related years. This resulted in a loss of 
9,735 observations. Also, as suggested by the 2006 Guide to Using NCCS Data, any Suspicious Observations were 
compared to full text versions of the Form 990 available at Guidestar (http://www.guidestar.org). A small number of
corrections were made, primarily related to the units reported (i.e., the file listed $5 instead of $5 million). To our
knowledge, any remaining errors create noise but do not systematically bias our tests.
TABLE 2
Sample Description

Panel A: Sample Selection
                                                   Full        Public Support   Government
                                                   Sample      Sample           Sample
Public operating charities reporting to IRS        2,261,486
Organizations receiving A-133 audit                  337,353
Merge IRS and A-133 data                             129,356
Less audit periods other than “Annual”                (1,368)
Full Sample                                          127,988
Less public (government) support < $100,000                       (75,389)         (54,384)
Less no public (government) support data                           (5,281)          (8,189)
Total Observations                                   127,988       47,318           65,415
Unique Organizations                                  27,495       12,342           16,369

Panel B: Observations by NTEE Classification
                                                   Full        Public Support   Government
Arts                                                   1,912        1,136            1,268
Education                                             15,574        9,435            9,688
Health                                                26,397       10,397           14,935
Human Services                                        69,320       20,788           30,667
Public Benefit                                        10,923        3,745            6,761
Other                                                  3,862        1,817            2,096
Total Observations                                   127,988       47,318           65,415

Panel C: Observations by Auditor Type
                                                   Full        Public Support   Government
Big 4                                                 11,254        7,370            6,690
Regional                                              11,467        5,225            6,051
Specialist                                            18,768        6,042            8,493
Other                                                 86,499       28,681           44,181
Total Observations                                   127,988       47,318           65,415


NTEE classifications: Arts (Major Group A), Education (Major Group B), Health (Major Groups E, F, G, H), Human
Services (Major Groups I, J, K, L, M, N, O, P), and Public Benefit (Major Groups R, S, T, U, V, W).
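
As a quick consistency check on Table 2, the arithmetic of the sample selection and the panel totals can be reproduced with a few lines of Python. The figures are copied from the table; the variable names are ours and purely illustrative.

```python
# Panel A: sample selection arithmetic, using the counts reported in Table 2.
merged = 129_356
full_sample = merged - 1_368                      # drop non-annual audit periods
public_support = full_sample - 75_389 - 5_281     # below cutoff, then missing data
government = full_sample - 54_384 - 8_189         # below cutoff, then missing data

# Panel B: (Full, Public Support, Government) counts by NTEE classification.
panel_b = {
    "Arts": (1_912, 1_136, 1_268),
    "Education": (15_574, 9_435, 9_688),
    "Health": (26_397, 10_397, 14_935),
    "Human Services": (69_320, 20_788, 30_667),
    "Public Benefit": (10_923, 3_745, 6_761),
    "Other": (3_862, 1_817, 2_096),
}

# Panel C: (Full, Public Support, Government) counts by auditor type.
panel_c = {
    "Big 4": (11_254, 7_370, 6_690),
    "Regional": (11_467, 5_225, 6_051),
    "Specialist": (18_768, 6_042, 8_493),
    "Other": (86_499, 28_681, 44_181),
}


def column_totals(panel):
    """Sum each of the three sample columns across the rows of a panel."""
    return tuple(sum(row[i] for row in panel.values()) for i in range(3))


expected = (full_sample, public_support, government)
print(expected, column_totals(panel_b), column_totals(panel_c))
```

Both panels sum to the same three totals as Panel A, so the classification by industry and by auditor type each covers every observation exactly once.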


comprise approximately 12 percent of the Full Sample and 20 percent of the Public Support
Sample, are significantly larger than other types of nonprofit organizations. In Table 2, Panel C, we
classify observations by the type of auditor, which is based on audit firm size and experience in
conducting A-133 audits, similar to Keating et al. (2005). The Big 4 category includes the largest
public accounting firms during the sample period.²¹ The Regional category includes any of the
next 25 largest accounting firms from Accounting Today's 2004 Top 100 Firms list (ranked by


21 This category includes any nonprofit organization audited by Deloitte & Touche, Ernst & Young, KPMG,
PricewaterhouseCoopers (Coopers & Lybrand or Price Waterhouse), or Arthur Andersen.
revenues). The Specialist category includes accounting firms, not already classified as Big 4 or
Regional, that conducted 100 or more A-133 audits during the sample period. The Other category
includes all accounting firms not already classified as Big 4, Regional, or Specialist. The Specialists
conducted approximately 15 percent of all A-133 audits in our sample, while the Big 4 and
Regional firms each conducted 9 percent of the audits.²² Organizations receiving at least $100,000
of public support are more likely to receive an audit by a Big 4 firm (15 percent of the Public
Support Sample). In fact, untabulated results indicate that Big 4 firms audit the largest nonprofit
organizations receiving federal funds. Using the Full Sample, the mean total assets of a Big 4
auditee is $405.9 million, while the mean total assets of a Regional auditee is $29.5 million.
Regional firms, in turn, audit larger organizations than Specialists, which, in turn, audit larger
organizations than the other accounting firms.
Table 3, Panel A provides descriptive statistics for the three samples. For the Full Sample, the
mean (median) SIZE is $45.8 million ($2.4 million). The Public Support Sample is substantially
larger, with a mean (median) SIZE of $101.6 million ($5.1 million). Across all three samples,
organizations are relatively mature; the mean age for the Full Sample is 25 years. The median
INDIRECT SUPPORT is $0 for all samples, indicating that many organizations do not receive any
indirect support from federated fundraising campaigns. Note that all of the continuous variables except
AGE and GROWTH are right-skewed. Thus, we use natural log transformations for these variables
in our analysis.
During our sample period, the A-133 audit provided four indicators of an internal control
problem: (1) if the organization discloses a reportable condition related to financial reporting
(RC_FS); (2) if any reportable condition related to financial reporting constitutes a material
weakness (MW_FS); (3) if the organization discloses a reportable condition related to compliance with
federal program requirements (RC_GOV); and (4) if any reportable condition related to federal
program compliance constitutes a material weakness (MW_GOV). Material weaknesses are a
subset of reportable conditions, representing the more severe internal control problems. In addition
to these four indicators, we also create a variable, RC_ANY (MW_ANY), which denotes the
existence of a reportable condition (material weakness) over either financial reporting or federal
program compliance.
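
The composite indicators can be illustrated with a minimal sketch. The dictionary layout and the helper names below are our own assumptions for illustration; only the variable names and the logical definitions come from the text.

```python
def any_flag(fs: int, gov: int) -> int:
    """Return 1 if either the financial-reporting or the federal-program flag is 1."""
    return 1 if (fs == 1 or gov == 1) else 0


def is_consistent(record: dict) -> bool:
    """Material weaknesses are a subset of reportable conditions,
    so a material weakness flag without the matching reportable
    condition flag would signal a data problem."""
    return (record["MW_FS"] <= record["RC_FS"]
            and record["MW_GOV"] <= record["RC_GOV"])


# Hypothetical organization-year: a reportable condition over financial
# reporting that does not rise to the level of a material weakness.
record = {"RC_FS": 1, "MW_FS": 0, "RC_GOV": 0, "MW_GOV": 0}
record["RC_ANY"] = any_flag(record["RC_FS"], record["RC_GOV"])
record["MW_ANY"] = any_flag(record["MW_FS"], record["MW_GOV"])
print(record, is_consistent(record))
```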
Table 3, Panel B reports the frequency of each type of internal control problem. Given the
relatively small number of observations with a material weakness, we focus primarily on reportable
conditions. Overall, 14.86 percent of the sample discloses a reportable condition over financial
reporting and 13.76 percent discloses a reportable condition over federal program compliance.
Not surprisingly, there is significant overlap among organizations disclosing a financial statement
internal control problem and organizations disclosing a federal program internal control problem.
For comparison purposes, 14 to 15 percent of for-profit companies report a material weakness
(Doyle et al. 2007a, Table 5).


V. RESULTS 

Determinants of Internal Control Deficiencies 
Table 4 reports simple correlations between our measures of internal control problems and
organizational characteristics. As predicted, a going-concern paragraph in the audit opinion is
positively associated with internal control deficiencies, while reporting a surplus is negatively
associated with internal control deficiencies. Disclosure of internal control problems is positively
associated with GROWTH, RISK, and NEWGRANTEE as expected. SIZE is negatively associated


22 These frequencies are quite different from the frequencies in the for-profit sector. For example, Ashbaugh-Skaife et al. 
(2007) report that the six dominant audit suppliers account for 84.7 percent of the audits of public companies. 
with RC_FS. The correlation between COMPLEXITY and the reportable condition measures is
unexpectedly negative, which could be because COMPLEXITY is correlated with SIZE. For each
of the empirical models estimated in the subsequent tables, we calculate variance inflation factors
using ordinary least-squares and determine that multicollinearity is not a significant concern.
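
For readers unfamiliar with the diagnostic, a variance inflation factor for regressor j is 1/(1 - R_j^2), where R_j^2 comes from regressing that regressor on the others. The sketch below covers only the two-regressor case, where R_j^2 equals the squared Pearson correlation; the data are made up and the function names are ours, so it should not be read as the procedure actually run in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x, y):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical values standing in for COMPLEXITY and log(SIZE).
complexity = [1, 2, 3, 4]
log_size = [2.0, 4.0, 5.0, 9.0]
print(vif_two_regressors(complexity, log_size))
```

A VIF of 1 means the two regressors are uncorrelated; larger values indicate that more of one regressor's variance is explained by the other.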

Table 5 presents the results from the first stage of our analysis. In Panel A, we estimate the
probability of disclosing a reportable condition over financial reporting (RC_FS) as a function of
organizational characteristics and audit detection factors, using a probit model for the three
samples. The coefficients on COMPLEXITY, financial health (GOINGCONCERNRISK and SURPLUS),
SIZE, GROWTH, and RISK have the predicted signs and are statistically significant. The
coefficient on NEWGRANTEE is unexpectedly negative and significant. This negative coefficient
results from the high association between NEWGRANTEE and RISK. When RISK is removed from
the model, the coefficient on NEWGRANTEE is significantly positive.

In Table 5, Panel B, we estimate the probability of disclosing a reportable condition over 
either financial reporting or federal program compliance (RC_ANY). The results are similar to
those in Panel A, except for the coefficients on COMPLEXITY in the Government Sample and 
SIZE in the Public Support and Government Samples. Overall, our evidence is consistent with the 
notion that less financially healthy and growing organizations disclose more internal control de- 
ficiencies. In addition, we provide some evidence that smaller and more complex organizations 
disclose more internal control problems. 

The model in Table 5 also includes indicator variables for the type of auditor performing the 
A-133 audit. The coefficient on Big 4 auditors is reliably negative and significant, consistent with
Keating et al. (2005). This result indicates that the probability of disclosing an internal control 
problem decreases if a Big 4 firm is used, and suggests that these audit firms selectively contract 
with certain high-quality nonprofit organizations. However, the coefficient on Regional firms is 
significantly positive, which indicates the likelihood of disclosing an internal control problem 
increases if a Regional audit firm is used. 

We also estimate the probability of disclosing a reportable condition over federal program
compliance (RC_GOV). The untabulated results are generally consistent with those in Table 5,
with some minor exceptions. For the Full Sample, the coefficient on COMPLEXITY is not significant,
and for the Public Support Sample, the coefficients on SIZE and GROWTH are not significant.
In addition, we estimate the probability of disclosing a material weakness (MW_FS and
MW_ANY). Again, the results are generally consistent with those in Table 5. We find that all
coefficients are statistically significant with the predicted sign, except for the coefficient on
GROWTH, which is positive but not significant. Finally, we include the lagged internal control
problem (either RC_FS or RC_ANY, measured in the prior year) and exclude NEWGRANTEE
(because new grantees do not have prior-year internal control data). Not surprisingly, the
coefficient on the lagged internal control problem is significantly positive. That is, if an
organization reported an internal control weakness in the prior year, then the organization is
more likely to report an internal control weakness in the current year.


Effect of Internal Control Deficiencies on Public Support 


Table 6, Panel A reports the results from the second stage of our analysis. We estimate a
regression of public support on the disclosure of a reportable condition over financial reporting in
the prior year and include the inverse Mills ratio computed using the parameters from Table 5 for
the Public Support Sample. For this table and all subsequent tables, we use Huber-White robust
standard errors, where errors are clustered by organization.
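
For reference, the inverse Mills ratio at a probit index z is the standard normal density divided by the standard normal distribution function, φ(z)/Φ(z). The sketch below computes only this basic form; the index value is hypothetical, and the text does not say which variant of the correction term (for example, a separate expression for organizations without a reportable condition) enters the second-stage regressions.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z: float) -> float:
    """Basic inverse Mills ratio: pdf(z) / cdf(z)."""
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical first-stage probit index (x'b) for one organization-year.
index = -1.0
print(inverse_mills(index))
```

The ratio falls as the index rises, so observations that the first-stage model considers unlikely to disclose a problem receive the largest correction term.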

In the first column (“Base”) of Table 6, Panel A, we estimate the traditional Weisbrod and
Dominguez (1986) model. Consistent with prior research, the coefficients on FUNDRAISING EXP
and AGE are significantly positive, while the coefficient on PRICE is significantly negative. In
TABLE 5
Determinants of Internal Control Deficiencies

Panel A: Financial Statement Reportable Conditions

                              Pred.   Full            Public Support   Government
Variable                      Sign    Sample          Sample           Sample
Intercept                             -1.080***       -1.086***        -0.968***
                                      (398.828)       (117.427)        (161.340)
COMPLEXITY                    +       0.081***        0.045***         0.029**
                                      (163.447)       (7.455)          (5.992)
GOINGCONCERNRISK              +       0.737***        0.882***         0.801***
                                      (455.994)       (173.643)        (238.627)
SURPLUS                       -       -0.068***       -0.176***        -0.123***
                                      (48.240)        (111.267)        (78.667)
SIZE                          -       -0.029***       -0.014***        -0.025***
                                      (91.230)        (6.670)          (34.136)
GROWTH                        +       0.042***        0.052**          0.042***
                                      (15.460)        (5.930)          (7.260)
RISK                          +       0.711***        0.726***         0.745***
                                      (5751.825)      (20960.732)      (3218.747)
NEWGRANTEE                    +       -0.032**        -0.088***        -0.083***
                                      (4.270)         (8.624)          (9.810)
BIG4                          ?       -0.382***       -0.367***        -0.349***
                                      (316.585)       (150.812)        (142.823)
REGIONAL                      ?       0.280***        0.290***         0.243***
                                      (346.814)       (152.181)        (132.915)
SPECIALIST                    ?       -0.145***       -0.009           -0.102***
                                      (109.964)       (0.172)          (25.931)
Industry Indicators                   Included        Included         Included
Year Indicators                       Included        Included         Included
No. of Observations Used              127,236         47,281           65,364
No. of Observations with              18,859          6,515            9,339
  Internal Control Deficiencies
Likelihood Ratio                      10356.535       3506.548         4857.398
                                      <0.0001         <0.0001          <0.0001
Percent Concordant                    71.90%          71.30%           70.90%

Panel B: Any Reportable Conditions

                              Pred.   Full            Public Support   Government
Variable                      Sign    Sample          Sample           Sample
Intercept                             -1.385***       -1.562***        -1.334***
                                      (733.490)       (275.569)        (340.810)
COMPLEXITY                    +       0.046***        0.046***         -0.010
                                      (58.822)        (8.833)          (0.749)
GOINGCONCERNRISK              +       0.705***        0.815***         0.786***
                                      (424.905)       (149.436)        (232.546)
SURPLUS                       -       -0.081***       -0.148**         -0.120***
                                      (75.011)        (87.255)         (83.362)
SIZE                          -       -0.008***       0.010*           -0.002
                                      (8.672)         (3.821)          (0.338)
GROWTH                        +       0.046***        0.061***         0.048***
                                      (19.673)        (8.864)          (10.339)
RISK                          +       0.752***        0.753***         0.771***
                                      (7243.628)      (2531.856)       (3820.811)
NEWGRANTEE                    +       -0.032*         -0.049*          -0.080***
                                      (4.668)         (2.893)          (9.469)
BIG4                          ?       -0.070***       -0.103***        -0.089***
                                      (15.905)        (15.623)         (12.412)
REGIONAL                      ?       0.288***        0.308***         0.251***
                                      (404.169)       (192.368)        (155.280)
SPECIALIST                    ?       -0.083***       -0.007           -0.049***
                                      (40.297)        (0.096)          (6.993)
Industry Indicators                   Included        Included         Included
Year Indicators                       Included        Included         Included
No. of Observations Used              127,236         47,281           65,364
No. of Observations with              23,996          8,516            11,749
  Internal Control Deficiencies
Likelihood Ratio                      11345.371       3954.150         5346.047
                                      <0.0001         <0.0001          <0.0001
Percent Concordant                    70.70%          70.30%           69.90%


*, **, *** Indicates statistical significance at the 0.10, 0.05, and 0.01 levels, respectively.

All variables are defined in Table 1. We use log form for all continuous variables except GROWTH. Wald Chi-squared
statistics are reported in parentheses.


model (1), we include an indicator variable for the existence of a reportable condition over
financial reporting. The coefficient on lagged RC_FS (-0.210) is significantly negative. In model (2),
we add GOV CONTRIBUTIONS, PROGRAM REVENUE, and lagged PUBLIC SUPPORT, which are
all significantly associated with current PUBLIC SUPPORT. Nevertheless, the coefficient on lagged
RC_FS is still significantly negative. Note that by including lagged public support, we are essentially
estimating the change in public support associated with disclosing an internal control problem. Our
evidence suggests that, all else equal, reporting internal control problems over financial reporting
is associated with 3.8 percent less public support on average. In model (3) of Panel A, we estimate
the influence of reporting the more severe material weakness over financial reporting on subsequent
public support. Again, the coefficient is significantly negative (-0.056). As indicated in
Table 6, Panel B, we find similar results when we use RC_ANY. The evidence in Table 6 is
consistent with our hypothesis that reportable conditions related to the financial statements have a
detrimental effect on subsequent support.
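
Because the continuous variables enter in log form, a coefficient on an indicator variable can be read as a percentage difference in the dependent variable. The sketch below shows the two common conversions, assuming the dependent variable is the natural log of public support; the function names are ours, and we do not know which conversion underlies the percentages quoted in the text.

```python
import math


def exact_percent_effect(beta: float) -> float:
    """Exact percentage change implied by an indicator coefficient
    when the dependent variable is in natural logs."""
    return 100.0 * (math.exp(beta) - 1.0)


def approx_percent_effect(beta: float) -> float:
    """First-order approximation, accurate for small coefficients."""
    return 100.0 * beta


# Coefficients quoted in the text for Table 6, Panel A.
for label, beta in [("model (1), reportable condition", -0.210),
                    ("model (3), material weakness", -0.056)]:
    print(label,
          round(exact_percent_effect(beta), 2),
          round(approx_percent_effect(beta), 2))
```

For coefficients close to zero the two readings nearly coincide, while for the larger model (1) coefficient the exact conversion is noticeably smaller in magnitude than the approximation.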

Next, we examine the components of PUBLIC SUPPORT in models (4) and (5). Both Panels
A and B of Table 6 suggest that DIRECT SUPPORT is negatively associated with the existence
of an internal control weakness. The evidence for INDIRECT SUPPORT is mixed; Panel
A reports a significantly negative coefficient, while Panel B reports an insignificant coefficient.
Interestingly, the frequency of internal control problems is not statistically different for firms with
direct support compared to firms with indirect support.²³ Those organizing and contributing to
federated fund-raising campaigns do not appear to consider internal control problems as part of the
giving decision in a more sophisticated manner than other donors.

We also estimate the effect of disclosing a reportable condition over federal program compli- 
ance on subsequent support (untabulated). The coefficient on RC_GOV is negative but not signifi- 
cant (p = 0.16). This result suggests that giving decisions made by individuals, corporations, and 
foundations are highly associated with internal control problems over financial reporting but not 
with internal control problems over federal program compliance. 

We conduct a series of robustness tests on model (2) in Table 6, Panel A and Panel B. We 
control for macroeconomic factors by including GDP and state indicator variables. We address the 
possibility that low financial reserves are associated with both public support and the existence of 
an internal control problem by including GOINGCONCERNRISK. We include SIZE, as well as 
scale all dollar-denominated continuous variables by total revenue. Finally, we change the sample 
selection criteria to include all organizations with public support over $1,000, instead of $100,000. 
For each of these tests, we continue to find negative and significant coefficients on RC_FS and
RC_ANY, suggesting donors directly or indirectly use internal control information in their giving
decisions.

Finally, we investigate the association between internal control problems and subsequent
public support across the six NTEE industries listed in Table 2. We find negative and significant
coefficients on both RC_FS and RC_ANY for Education and Health. The coefficients on internal
control problems for the remaining industries are insignificant. According to Wing et al. (2008),
excluding Religion,²⁴ Education receives the largest percentage of total public support, followed
next by Health. Thus, it appears that internal control problems influence industries that receive
substantial contributions from donors.


Effect of Internal Control Deficiencies on Government Contributions 


Table 7 reports the results from estimating a regression of government contributions on
RC_FS, RC_ANY, and RC_GOV for the Government Sample. We again include the inverse Mills
ratio computed using the parameters from Table 5. The coefficients on RC_FS (-0.017) and
RC_ANY (-0.017) are negative and significant. Unlike with public support, the coefficient on
RC_GOV (-0.021) is also significantly negative. These results suggest that government agencies
do use information regarding internal controls over federal program compliance to make funding
decisions. When we use material weaknesses instead of reportable conditions, only the coefficient
on MW_GOV is significant.

In addition, the coefficient on FUNDRAISING EXP is significantly negative, consistent with 
Andreoni and Payne (2003). The coefficient on LOBBYING is positive, suggesting that organiza- 
tions that engage in activities to influence legislation receive more government grants. The coef- 
ficient on GDP is negative. Overall, government contributions to nonprofit organizations appear to 
be a function of political and economic factors, as well as the organization's perceived ability to 
fulfill its mission as signaled by the results of its internal control audit. 

The results in Table 7 are consistent using alternative specifications of Equation (3). In 


23 In fact, 20.7 percent of organizations that receive only indirect support disclose an internal control problem, while 19.1
percent of organizations that receive only direct support disclose a problem. Of the organizations that receive both direct
and indirect support, 16.8 percent disclose an internal control problem. These percentages suggest that any differential
effects of internal control problems on direct and indirect support are not caused by superior selection methods of
federated fundraising campaigns (i.e., these campaigns do not select out of giving to organizations with problems).

24 Religion receives the largest amount of public support. Most religious organizations are not included in our sample
because religious institutions are exempt from the Form 990 filing requirement and most do not receive federal funding.
particular, we include GOINGCONCERNRISK and SIZE, scale the continuous variables by total
revenue, and change the sample selection criteria to include all organizations with government 
contributions over $1,000 instead of $100,000. For each of these tests, we continue to find nega- 
tive and significant coefficients on the reportable condition indicators, suggesting that government 
agencies use internal control information in their funding decisions. 

In untabulated results, we next investigate the link between internal control problems and
subsequent government contributions for the five most frequent oversight agencies in the Government
Sample. The most frequent oversight agencies (percentage of sample) are the Department
of Health and Human Services (46 percent), the Department of Housing and Urban Development
(16 percent), the Department of Education (14 percent), the Department of Agriculture (5 percent),
and the Department of Justice (3 percent). No other federal agency represents more than 1 percent
of the sample. The coefficients on all three measures of reportable conditions are significantly
negative for organizations overseen by HUD. In addition, the coefficient on reportable conditions
over financial reporting is significantly negative for organizations overseen by the Department of
Agriculture. The coefficients on reportable conditions for organizations overseen by HHS, by far
the most frequent oversight agency, are negative but not significant.²⁶ The coefficients on reportable
conditions for the remaining most frequent agencies are also insignificant.


VI. CONCLUSION 

This study examines the causes and consequences of internal control weaknesses in nonprofit
organizations. The nonprofit sector provides a useful setting to examine the effects of internal
control disclosures because charitable giving by donors provides direct evidence of stakeholder
reactions to such disclosures. We first document that the likelihood of reporting an internal control
problem increases for nonprofit organizations that are complex, growing, smaller, and in poor
financial health. We then present evidence that weak internal controls over financial reporting are
negatively associated with subsequent public support and government contributions received after
controlling for the current level of support and other factors influencing contributions. Thus,
internal control information appears to affect, either directly or indirectly, the funders’ giving
decisions. Internal control reporting by nonprofits has been required for two decades. Our results
are generalizable to the for-profit sector because they provide long-term evidence that stakeholders
do indeed use internal control information to evaluate organizations.
Our specific results may interest several constituencies. First, the IRS and other regulators are
reformulating laws in an attempt to increase public confidence in the integrity of exempt organizations.
Second, donors want to make more informed charitable decisions. Third, watchdog groups
promulgate standards to evaluate an organization’s effectiveness in achieving its mission. These
standards may encourage organizations to underinvest in infrastructure in the short-run. In fact,
Hager et al. (2004) argue that pressure from donors and watchdog groups to maximize mission-related
spending and limit overhead costs to artificially low levels is detrimental in the long-run.





25 We define the oversight agency as the cognizant agency if a cognizant agency exists; otherwise it is the oversight agency
designated in the A-133 report. The oversight agency is likely, but not necessarily, the predominant supplier of government
contributions. To the extent that a nonprofit organization is complex and receives contributions from several
federal agencies or receives most federal funding from contract revenue rather than from contributions, the oversight
agency will be less meaningful for purposes of this test.

26 Recall that Medicare and Medicaid are program service revenue and not part of our tests. One possible explanation for
the lack of significant results for HHS is that the majority of HHS discretionary grant dollars are allocated to scientific
research (http://www.hhs.gov/grants/). These are generally multiperiod grants. Thus, our tests may not properly cover
the decision window for these grants.
Our evidence is consistent with the notion that short-term savings on administrative expenses (i.e.,
establishing internal controls) can ultimately have negative consequences on the organization's
donor support and, thus, the organization's ability to deliver services.

Finally, nonprofit managers and board members need to understand the risks of failing to meet 
donors' and government agencies' expectations with regard to accountability. We estimate that,
all else equal, organizations with internal control problems receive 3.8 percent less public support 
and 2.1 percent less government support. According to the National Council on Nonprofits (2009), 
the estimated cost of an audit for an organization with revenue of $600,000 is $12,000—$20,000. 
Audit costs likely increase in the presence of internal control problems (Hogan and Wilkins 2008). 
Thus, in some cases, a cost-benefit analysis indicates an overall benefit from conducting periodic, 
thorough internal reviews of internal controls. If attestations of internal controls by external 
auditors are cost-prohibitive, then nonprofit organizations can seek in-kind support to help them 
improve their internal controls. For example, technology companies often donate technical support 
to nonprofit organizations. Similarly, other corporate donors with Sarbanes-Oxley experience can 
provide guidance on creating and maintaining adequate internal control systems. 
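
A rough break-even calculation illustrates the cost-benefit argument. The 3.8 percent figure and the $12,000 to $20,000 audit cost range come from the text; treating the audit cost as a proxy for the cost of maintaining adequate controls, and ignoring government support and multi-year effects, are simplifying assumptions of this sketch.

```python
PUBLIC_SUPPORT_PENALTY = 0.038  # estimated loss in public support, from the text


def breakeven_support(control_cost: float,
                      penalty: float = PUBLIC_SUPPORT_PENALTY) -> float:
    """Annual public support at which the expected one-year loss from an
    internal control problem equals the cost of preventing it."""
    return control_cost / penalty


# Audit cost range quoted in the text for an organization with $600,000 revenue.
for cost in (12_000, 20_000):
    print(cost, round(breakeven_support(cost)))
```

Under these assumptions, an organization whose annual public support exceeds the break-even amount would lose more in expected donations in a single year than the control-related spending would cost.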

Internal control in the nonprofit sector is a relatively unexplored area for researchers and there 
are many questions left to be addressed. We show that the disclosure of any internal control 
weakness is associated with future declines in public support, but leave open the question of which 
types of donors (i.e., foundations, corporations, or individuals) respond to the internal control
information. Likewise, a more refined analysis would assist in determining which federal, state,
and local agencies use the A-133 data. Finally, further research is needed to investigate how 
internal control weaknesses influence other aspects of a nonprofit organization's operations, in- 
cluding earnings management and executive compensation. 
MATTHEW GILL, Accountants’ Truth: Knowledge and Ethics in the Financial World (Oxford,
U.K.: Oxford University Press, 2009, ISBN 978-0-19-954714-2, pp. x, 198).


More than two decades ago, I closed a book on the professions with the remark that “surely accounting is 
today far more socially important than medicine." There were then few studies of the accounting profession and, 
above all, no serious theoretical analyses of accounting as the intellectual activity that both measured and norma- 
tively defined the economic phenomena at the center of modern life. In the years since, there have been a number 
of such studies and indeed a journal (Accounting, Organizations and Society) that has focused attention on such 
questions. Matthew Gill's Accountants' Truth falls in this new tradition. Because the book is written primarily for 
an audience in accounting rather than sociology, however, I am an awkward reviewer. But perhaps the accountant
reader will find it interesting to know what strikes a sociologist about Gill's book. 

The book describes the intellectual world of a very particular group: 20 accountants, snowball-sampled five 
each from the Big 4's London offices. They are all young men, nearly all ICAEW-qualified, nearly all within a few 
years of their original training. Since many potential interviewees declined to participate, the group is somewhat 
self-selected. Himself an ICAEW accountant, Gill interviewed these 20 in detail, including in his interview 
protocol a discussion of an accounting problem-scenario sent ahead. The book has six substantive chapters: a 
theoretical discussion of performance and truthfulness, an empirical analysis of reactions to the scenario, chapter- 
length discussions of technocratism, pragmatism, and professionalism, and a chapter on ethics. 

Gill argues that accountants inevitably experience a tension between technocratism (the attempt to reduce 
accounting to mere rules, abstracting accounting from accountants) and pragmatism (the decision to bracket 
accounting's "truth" and to understand accounting practice through alternative metaphors such as strategy, sport, 
or family). The problem with rules is that rationalization in accounting simply makes accountants more and more 
alienated from the actual practice of accounting; rules make accountants automata, reducing professional commit- 
ment as well as ambiguity. As a result, "professionalism" becomes for these young men a kind of refuge— 
vacuous, but safe nonetheless—from the ethical problems that their alienated selves can see but lack the desire to 
rectify. “Ethics” becomes a matter of subtly shifting responsibility to the firm (“employee ethics") or onto the law 
(“legal ethics"). Beyond these lies equivocation, yet another technique for handling the alienation inevitable in 
accounting work. Yet in the end Gill argues that this state of affairs is quite understandable, given the actual 
situations of his respondents, and that plans to shift it by the usual means of further rationalization and regulation 
are self-defeating. He devotes a final chapter to potential fixes for alienation, following the late Eliot Freidson— 
longtime dean of the sociologists of professions—in calling for a renewed commitment to a deeper and more 
comprehensive professionalism. 

Although I found the book very interesting, I worry that what may seem interesting to me—the demonstra- 
tions of exactly how it is that accounting works as a particular discourse and how accounting "facts" are 
constructed—may be everyday common sense to the accounting reader. Conversely, the more general claims—for
example, the claim that knowledge consists of a best stab at representing something about which there can be no
final truth—are common sense in sociology, although they may be new to the accountant reader. The book doesn't
aim to further develop such general claims, but rather to apply them in the particular world of accounting (that's
why it seems to me a book for accountants). And although its central idea—that increasing rationalization para- 
doxically increases alienation among accountants and may even facilitate equivocation—is quite interesting, that 
claim cannot (at least for a sociologist) be empirically established or even fully explored on the basis of only 20 
interviews. 

Such a reaction is inevitable in the sociologist reader. Discourse and construction are old news to me, and I 
have never thought that contemporary accountants merely discover facts, although I do know that a century ago 
accountants on-site counted physical inventory. What I wanted in the book was five times as many interviews, 
quota-sampled across a variety of types of accounting, and a much more detailed analysis of how different aspects 
of the discourse and practice of accounting go together differently (or similarly, if that is the case) for different
kinds of accountants. That is, I wanted much more data and the analysis a sociologist would have written. 

But that’s a narrow disciplinary worry. The advantage of a very small sample is that it allows respondents to 
become full and complex individuals. Yet Gill’s narrative does not always take full advantage of that complexity, 
both because he takes his respondents as a group emblematic of accounting in the abstract and because the 
organization of the book around substantive themes (rather than the complexities of individuals’ adaptations) hides 
those very particular complexities from us. As a result, having chosen respondents very closely grouped in age, 
gender, professional seniority, and employer, Gill leaves us wondering how far these results generalize. Perhaps 
more disturbing, he tells us early on that many of these young men will leave for other occupations in short order;
accountancy is for them a life stage, a preparation for something else. But given that, how can we take their 
"discourse" as that of accountancy in general? Gill claims that by looking at the core of the recently trained elite
he will find the cutting edge of accountants' discourse. True enough, but one cannot help thinking that these are 
very young men, interviewed by a young man, and that their love of the "sport" of tax evasion, their appreciation
for the clever “wheeze,” and their cavalier attitude toward their society and its non-economic institutions may
derive from a juvenile masculinity that will wear off as life teaches them a few nasty lessons.

The sample raises another issue. It is to be sure a sample of accountants, but it is also a sample of people who 
work for large organizations. The refuge in technocracy and pragmatism, the protective armor of employee ethics 
and avoidance of responsibility—these are the stuff of bureaucratic life. Indeed, Gill often invokes Robert Jackall's 
(1988) brilliant Moral Mazes, which concerns considerably older middle and upper managers of large organiza- 
tions. And like Jackall's managers, Gill’s accountants will be able to “outrun” whatever problems they create, in 
their case by leaving the occupation entirely rather than by new appointments elsewhere. One worries, then, that 
Gill's actual topic may be organization men, not accountants in particular. 

My insistence on surroundings of course reflects my own work about the professions. I wanted to hear much 
more about accountants’ relations with lawyers, tax advisors, business consultants, and other competitors in this
work area. The book is very good on relations with clients, but other than a few words about the law, it doesn't tell 
us a lot about the other professions crowding the accountants, nor about those (e.g., business consultants) whom 
the accountants would themselves like to crowd out (a battle that has been going on at least since the 1920s, at
least in the U.S.). 

As someone whose theories are rooted in the historical evolution of professions over time, I also wanted 
more background on the history of accounting and on the historical moment (after neoliberalism and the so-called
Big Bang in English legal affairs) in which these young men found themselves. Indeed, the book's argument that 
rationalization increases alienation cannot, ultimately, be established on the basis of purely contemporary data. It 
requires data over time. 

Finally, it is of the nature of a book written for an accounting audience that it does not ask what, for a 
sociologist, would be the major theoretical questions. But nonetheless an accountant reader might wonder what 
those sociological questions are. For me, they involve the second-order clients or, to put it in John Dewey's term, 
the “public” of accounting. (A Marxist would call it the political economy of accounting.) Although Gill starts us 
with some general words about principals and agents, he moves very quickly into a data analysis that takes the 
larger social situation of his accountants for granted. But public accounting started not because clients sought it, 
but because investors demanded it. In the early years of English public accounting, when 80 percent to 90 percent 
of new joint stock companies failed annually and vast amounts of British capital were being invested in projects 
overseas, investors demanded public accounting as a means of knowing something real about the objects of their 
investments. Early English accounting knew that its principal "client" was the investor, not the firm audited. 
Today, of course, the situation is quite different. It is not clear for which principal accountants are today the agents, 
nor indeed which among the investors—smart money, dumb money—is the “user” envisioned by accountants
when they decide to locate this or that exceptional item above or below the line. And of course the Inland Revenue
has also emerged as a crucial audience for reports, as is the legal system that holds the ultimate power. 

This complex arena in which the various consumers of balance sheets transact and argue with one another is 
never seriously analyzed, by either the respondents or by Gill himself. The general social setting of accounting, 
how it is paid for, who reads it, what are its distributional consequences for the society at large: none of these is 
treated as problematic, perhaps because they are so commonsensical to the presumed audience. But to a sociolo- 
gist, these matters are the main event. The principal force driving the complexities and difficulties that Gill
portrays so well is the simple fact that accountants are providing information on the basis of which some people
will try to make money at the expense of others. The audited firms that provide them with information, the 
short-run looter-investors, the long-run disattentive investors, the tax officials aiming to fund a government, the
workers wondering about pension funding: all these people are trying quite reasonably to acquire money that 
others think is more properly theirs, or theirs for the taking, or whatever. These various people have widely 
varying levels of expertise and radically different time horizons in addition to their complexly conflicting interests 
in the present. Furthermore, they are all involved in a system that rewards those few who, like motorists who 
continue driving in a lane they know will be closed in a mile or so, free ride on the much larger group that obeys
the rules to the letter. 

It is this complex network of conflicting interests, antithetical publics, and high stakes that makes accounting 
what it is as a profession and as a part of the complex institutions of modern capitalism. Yet all of this is mostly 
absent from the book, or, perhaps better put, is simply taken for granted. That is, indeed, what most clearly marks 
the book as a book for accountants rather than sociologists. 

I should close with some things I really liked. I suspect that many of them are old news to accountants, but 
they are what struck me as an outsider. I very much liked the insight that transparency can be a way of passing the 
buck. I like the Sennettian insight that accounting needs a public way to talk about ethical issues. I liked the 
discussion of “material importance," with the correlative concepts of user responsibility and caveat lector. I grew 
to like the respondents, whom I eventually started to recognize as old friends, perhaps as Gill intended by 
stretching their evidence out across the whole of his empirical analysis. 

Gill is to be congratulated on a book that opens many of the important issues raised by accounting as a 
practice constitutive of the realities of modern society. But we are still waiting for a sociologically thorough 
theoretical and empirical analysis of those issues. This book is the starter or perhaps the fish course. The main 
course—perhaps it is Gill's next book—is yet to come.


to the date of publication. The Accounting Review is published in January, March,
May, July, September, and November. Position ads in the "Placement Ads" are free 
with the purchase of a job posting in the AAA Career Center. For more information 
on how to purchase a job posting in the Career Center and receive your free position 
ad in The Accounting Review, go to the AAA website at http://aaahq.org and click on 
"Career Center," or call our office at 941-921-7747. 


LOYOLA MARYMOUNT UNIVERSITY, Department of Accounting invites applicants for a full- 
time tenure-track faculty position at Assistant/Associate rank effective Fall 2010. Loyola Mary- 
mount, a comprehensive university in the mainstream of American Catholic higher education, seeks 
professionally outstanding applicants who value its mission and share its commitment to academic 
excellence, the education of the whole person, and the building of a just society. A doctorate in 
Accounting from an AACSB-accredited university is required. All candidates should demonstrate a
strong commitment to scholarly research and excellence in teaching. Candidates with interest in 
Audit, Financial Accounting, and Cost areas are encouraged to apply. Salary is competitive and 
research support is available. To learn more about this position, contact Professor Mahmoud 
Nourayi, at (310) 338-5831, or mnourayi@lmu.edu. LMU is an Equal Opportunity Institution ac-
tively working to promote an intercultural learning community. Women and minorities are encour- 
aged to apply. Visit http://www.lmu.edu for more information. 


BOSTON COLLEGE, Carroll School of Management seeks applicants for a full-time, non-tenure- 
track Lecturer position in the Department of Accounting beginning Fall 2011. Applicants must have 
at least a master's degree or professional certification, excel in teaching at both the undergraduate 
and graduate levels, and be enthusiastic participants in university service activities. Ability to teach 
cost accounting, auditing, or accounting information systems is required. We offer competitive 
salaries and benefits, a collegial environment, an energetic faculty with close ties to both Boston and 
national financial and academic communities, and outstanding students. We seek individuals with an 
openness to diverse intellectual interests and an appreciation for Boston College's Jesuit and liberal 
arts traditions. Applicants should provide a curriculum vitae and recent teaching evaluations to: Billy 
Soo, Chair, Department of Accounting, Carroll School of Management, Boston College, Chestnut 
Hill, MA 02467; or Email: billy.soo@bc.edu. Applications received by January 15, 2011 will receive
full consideration. The Carroll School of Management is an Equal Opportunity Employer. 


YALE SCHOOL OF MANAGEMENT is seeking additional faculty members at the junior level in 
the area of accounting. Ph.D. (or final stages of dissertation) and demonstrated potential for high- 
quality research and teaching required; interdisciplinary orientation is preferred. Appointments will 
be made for the 2011-2012 academic year. To apply online, visit: http://mba.yale.edu/faculty/
openings.htm. Please note that we are only accepting electronic applications this year. The deadline 
for receipt of all materials is November 30, 2010. Yale is an Equal Opportunity/Affirmative Action 
Employer and especially encourages applications from women and members of minority groups. 


UNIVERSITY OF WISCONSIN—EAU CLAIRE, Department of Accounting and Finance seeks
candidates for a probationary tenure-track faculty position at the rank of Assistant Professor of 
Accounting beginning August 22, 2010. Areas of teaching may include financial, managerial, or 
governmental accounting depending on qualifications. Teaching in the M.B.A. program is part of 
this position. Qualifications: Doctorate in accounting or related field must be received no later than 
May 31, 2012. Professional certification (CPA, CMA), evidence of scholarly activity and relevant 
work experience are all considered assets. Applicants should demonstrate capacity for high-quality 
teaching. Application procedure: Only electronic applications are accepted. Send letter of applica- 
tion, graduate school transcript, resume, summary of student evaluations (if available), and contact 
information for three references to: dbecker@uwec.edu; Dr. D'Arcy Becker, Chairperson, Depart- 
ment of Accounting and Finance, University of Wisconsin—Eau Claire, Wisconsin 54702-4004. For
priority consideration, all application materials must be received by October 15, 2010; screening will 
continue until the position is filled. The University reserves the right to contact additional references 
with notice given to the candidates at an appropriate time in the process. Applicants’ names are 
subject to public release unless confidentiality has been requested in writing. Names of all finalists
will be released upon request. A criminal background check is required prior to employment. Doc- 
torate in accounting or related field must be received no later than May 31, 2012. Professional 
certification (CPA, CMA), evidence of scholarly activity, and relevant work experience are all 
considered assets. Applicants should demonstrate capacity for high-quality teaching (as evidenced 
by teaching evaluations if available). UW-Eau Claire is an AA/EOE, dedicated to enhancing
diversity. To learn more, visit our website: http://www.uwec.edu/acadaff/jobs/.


UNIVERSITY OF SOUTH CAROLINA, School of Accounting is accepting applications for a
tenure-track position beginning fall 2011. Candidates should have primary teaching and research 
interests in international accounting, and must possess a doctorate or expect to complete it by August 
2011. In addition to being committed to excellent teaching, candidates must be ambitious and 
motivated researchers who are committed to publishing in top tier accounting journals. Salaries are
competitive and commensurate with experience and achievements. All faculty searches are subject 
to the availability of funding. Submit curriculum vitae, working papers, and defended dissertation 
proposal with data to: Professor Al Leitch, Darla Moore School of Business, University of South 
Carolina, Columbia, SC 29208; Email: leitch@moore.sc.edu. The University of South Carolina is an
AA/EOE. Minorities and women are strongly encouraged to apply. 


SAINT LOUIS UNIVERSITY, a Catholic, Jesuit institution dedicated to student learning, research, 
health care, and service, is seeking applicants for a tenure-track position in Accounting at the Senior 
Assistant/Associate/Full Professor level, beginning Fall 2011. Candidate must demonstrate a com- 
mitment to teaching excellence and quality research and publication. Service to the department and 
university are also required in addition to teaching and scholarship. Professional certification is 
desirable. Teaching interest is open and all areas will be considered except financial accounting. A 
Ph.D. in Accounting is required. Dissertation-stage candidates could be considered under excep- 
tional circumstances. Compensation package is competitive. Interested candidates must apply online 
at: http://jobs.slu.edu (Req. 20100775); submit a cover letter and current curriculum vitae. All other 
correspondence regarding this position can be sent to: Dr. Ananth Seetharaman, Chair of Account- 
ing, John Cook School of Business, Saint Louis University, 3674 Lindell Boulevard, St. Louis, MO 
63108. Saint Louis University is an Affirmative Action/Equal Opportunity Employer, and encour- 
ages nominations of and application from women and minorities. 


UNIVERSITY OF MARYLAND, COLLEGE PARK, Robert H. Smith School of Business, Depart-
ment of Accounting and Information Assurance is seeking two (tenure-track) faculty positions at the 
Assistant Professor level to begin August 2011. For full consideration of all qualified individuals 
interested in the above noted position, applicants must apply online at: http://jobs.umd.edu; and 
submit a letter expressing their interest, along with a curriculum vitae, publications or job paper, and 
evidence of teaching ability by December 31, 2010. Applicants are expected to have a Ph.D., be able 
to generate research publications in top academic journals, and have demonstrated effectiveness in 
the classroom. While all areas of specialization will be considered, managerial accounting or audit- 
ing will be viewed positively. The University of Maryland, College Park actively subscribes to a 
policy of Equal Employment Opportunity, and will not discriminate against any employee or appli- 
cant because of race, age, sex, color, physical or mental handicap, national origin, or political 
affiliation. 


SAN FRANCISCO STATE UNIVERSITY has one tenure-track faculty opening in taxation. Candi- 
dates must have a terminal degree (or a candidate may be in the final stages of dissertation comple- 
tion). A Ph.D. or D.B.A. is preferred but candidates with a J.D. and an L.L.M. in taxation will also 
be considered. Candidates should have a strong commitment to both research and teaching. Profes- 
sional experience or certification is desirable. The position will be filled at a rank commensurate 
with experience. The Department of Accounting offers concentrations in Accounting for B.S., 
M.B.A., and M.S.B.A. degrees. The College of Business is AACSB-accredited at both the under- 
graduate and graduate levels. Accounting is one of eight departments in the College of Business and 
has approximately 17 faculty and 1,200 undergraduate accounting majors. Send a letter of applica- 
tion, curriculum vita, teaching evaluations, and the names and telephone numbers of three references
to: Dr. Jiunn Huang, Chair, Department of Accounting, College of Business, San Francisco State
University, 1600 Holloway Avenue, San Francisco, CA 94132. SFSU is an Affirmative Action/Equal
Opportunity employer. 


CLARKSON UNIVERSITY seeks an energetic scholar for a tenure-track position at the Assistant or
Associate level in Accounting, balanced between teaching and research beginning in August 2011. 
All areas of Accounting will be considered. The position requires a Ph.D. in Accounting, evidence of
teaching effectiveness and significant research potential. The School of Business has identified 
Supply Chain Management, Entrepreneurship, and Environmental Studies as key strategic directions 
and would particularly welcome applicants with an interest or previous work in any of these areas. 
Clarkson provides a quality environment for research including a two-course per semester teaching 
load for tenure-track faculty, a collegial environment, research, and other financial and nonfinancial 
support. Clarkson University values a culturally diverse faculty and strongly encourages applications 
from women and minority candidates. The University is an Equal Opportunity/Affirmative Action 
Employer. For more information and to apply online, visit the University's website at: http:// 
www.clarkson.edu/hr. 


AUBURN UNIVERSITY MONTGOMERY invites applicants for a nine-month tenure-track posi- 
tion at the Assistant or Associate level in Accountancy. Position begins in Fall 2011 semester. 
Candidates should possess expertise and teaching experience in Tax Accounting. The successful 
candidate will hold a Ph.D. in accounting and experienced applicants should have a well-established 
record of publications and effective teaching. Expectations include commitment to excellence in 
teaching, research, and service. Evidence of strong instructional effectiveness and scholarly produc- 
tivity is essential, as are strong communication and interpersonal skills. Business experience and 
professional certification are highly desirable. Ph.D. in Accounting; Experience in teaching tax; CPA 
certification highly desirable. Please contact Dr. Keren Deal at (334) 244-3227 or kdeal@aum.edu 
for more information. 


CENTRAL COLLEGE seeks applications for a tenure-line position as an Assistant Professor of
Accounting. Minimum qualifications include M.B.A./M.A. in accounting with CPA/CMA designa-
tion. Candidates with college teaching experience and/or holding a Ph.D. in accounting will be given 
preference. Please contact Marilyn Vrban at (641) 628-5175 or vrbanm@central.edu if you have any
questions regarding the open position. Visit https://www.central.edu/jobseekers for full position de- 
scription and application process. AA/EOE. 


COLLEGE OF THE HOLY CROSS, Economics/Accounting Department invites applications for a 
tenure-track position beginning August 2011. The Department seeks to hire an accountant whose
teaching and research interests might include managerial, auditing, international, or professional 
ethics. Ph.D. or proximate completion required. Strong candidates will demonstrate a dedication to 
excellent teaching within a liberal arts environment and a commitment to research, publication, and
service. This position carries a 3-2 teaching load with a full-salary, one-semester research leave prior
to tenure review and generous sabbatical and fellowship leaves for senior faculty. Submit cover 
letter, curriculum vita, three recommendation letters, research paper, statement of teaching philoso- 
phy, and undergraduate and graduate transcripts by December 15, 2010. The College of the Holy
Cross is a highly selective Catholic liberal arts college in the Jesuit tradition. It enrolls about 2,900 
students and is located in a medium-sized city 45 miles west of Boston. Holy Cross belongs to the 
Colleges of Worcester Consortium (http://www.cowc.org) and the New England Higher Education 
Recruitment Consortium (http://www.faculty.harvard.edu/01/013.html). Contact: Kolleen Rask, Eco- 
nomics Chair, Campus Box 45A, College of the Holy Cross, Worcester, MA 01610-2395. Website: 
http://www.holycross.edu/departments/economics/website/. The College complies with all Federal 
and Massachusetts laws concerning Equal Opportunity and Affirmative Action in the workplace. 


MARIST COLLEGE, School of Management seeks an Assistant/Associate Professor of Account-
ing beginning Fall 2011. This tenure-track position will involve teaching both undergraduate and
graduate courses (including online courses). The business and accounting programs are fully accred- 
ited by AACSB International. Candidates should have an earned doctorate in accounting or a related 
field. Professional experience and certification (CPA, CMA, CIA, CISA, etc.) are also desirable. 
Marist College is a highly selective, independent, liberal arts institution located in the historic 
Hudson River Valley of New York. Marist has been recognized for excellence by U.S. News &
World Report, The Princeton Review, Kiplinger's Personal Finance, Entrepreneur Magazine, and 
Barron's Best Buys in College Education, and is noted for its leadership in the use of technology to
enhance the teaching and learning process. To learn more or to apply, please visit https:// 
jobs.marist.edu. Only online applications are accepted. EOE/AA.


THE UNIVERSITY OF NORTH CAROLINA AT GREENSBORO, Department of Accounting and 
Finance invites applicants for a tenured or tenure-track position starting fall 2011. The academic 
rank is open and the salary competitive. Greensboro is located in the Piedmont Triad region near the 
Research Triangle and other educational institutions. The Department consists of 13 full-time faculty 
and offers B.S. degrees and an M.S. degree in Accounting. The Bryan School is accredited by
AACSB International in both business and accounting. The Department has subscriptions to CRSP,
Compustat, and WRDS. A doctorate (or short-term ABD status) in Accounting from an accredited
university is required. We will consider all areas but preference will be given to individuals with 
research and teaching interests in financial accounting and/or tax. For full consideration, submit a 
letter of interest, curriculum vita, and references to Accounting Search Committee at:
ACCFIN@uncg.edu; or by mail to: UNCG, Bryan School of Business and Economics, PO Box
26165, Greensboro, NC 27402-6165. Applications will be accepted until the position is filled. AA/ 
EOE. Moreover, UNCG is committed to recruitment and advancing women and minorities at all 
faculty/staff levels. 


THE UNIVERSITY OF NORTH CAROLINA AT CHARLOTTE, Belk College of Business invites
applications for full-time, tenure-track position(s) in Accounting. All ranks will be considered. A
Ph.D. degree from an accredited school in the field of accounting or closely related discipline is 
required. A strong commitment to excellence in high-quality, innovative research and teaching at the
undergraduate and graduate levels is expected. Candidates at the Associate or Professor level must 
have an outstanding research record commensurate with the level of appointment. Specializations in 
all accounting subfields are encouraged to apply. Teaching load and salary are competitive with 
research universities. The starting date is August 2011. All applications will be considered strictly
confidential. The review of applications will begin immediately and applications will be accepted 
until the position is filled. Apply electronically at: https://jobs.uncc.edu. If you have any questions 
concerning this position(s), please contact: Professor Jack Cathey, Chair of the Search Committee, at 
jmcathey@uncc.edu or (704) 687-7690. Only electronic submissions will be accepted. Please attach
the following documents with your electronic submission: application letter and vita. Finalists will 
be asked to forward official transcripts, letters of reference, and other supportive materials as re- 
quested by the search committee. The University of North Carolina at Charlotte is an Affirmative 
Action/Equal Opportunity Employer. Women, members of minority groups, and persons with dis- 
abilities are encouraged to apply. 


LAMAR UNIVERSITY, Department of Accounting and Business Law seeks a tenure-track Assis- 
tant or Associate Professor of Accounting. Accounting information systems or tax is preferred, but 
not required; however, must be able to teach in several content areas at both graduate and under- 
graduate level. Should demonstrate a record of excellence in (or potential for) innovative teaching, 
quality published/publishable research, service to the university/community, as well as a desire to
build a quality accounting program. Official transcripts are required at the time of employment. 
Positions will be filled pending available funding. For a detailed job description and application 
process visit: https://jobs.lamar.edu/postings/134. This position is security-sensitive and thereby sub-
ject to the provisions of the Texas Education Code §51.215, which authorizes the employer to obtain
criminal history record information. The successful candidate must have a Ph.D. (or ABD near 
completion) from an AACSB-accredited program or from an institution with an equivalent reputa- 
tion. Individuals with an L.L.M. in Taxation and a commitment to research will also be considered. 
AA/EOE/ADA. Lamar is committed to excellence through diversity. 


UNIVERSITY OF MELBOURNE, Melbourne Business School (MBS), which provides the M.B.A. 
and E.M.B.A. programs for the University, invites applications for a faculty position in financial 
accounting at any rank to begin September 2011 or January 2012. MBS offers a multi-disciplinary 
environment, as well as interaction with colleagues at the Faculty of Business and Economics within 
the wider University of Melbourne community. MBS faculty regularly publish in top-tier interna- 
tional journals. Salary, benefits, and teaching loads are competitive with U.S. and other leading 
international business schools. Research support is excellent. Candidates should have or be close 
to completing a Ph.D. in Accounting. Consistent with rank, applicants will be evaluated on their 
potential for and/or record of excellence in research and teaching. Graduate teaching experience 
is desirable, but not required. Applicants should submit a cv, working paper, teaching evalua- 
tions (where applicable), and three reference letters (or names of three references) to: 
j.frederickson@mbs.edu. 


BALL STATE UNIVERSITY seeks candidates for a tenure-track position as Assistant or Associate 
Professor of Accounting, starting August 22, 2011. Responsibilities include: teaching courses in 
financial, managerial, governmental and not-for-profit, international, and/or accounting information 
systems at the undergraduate and/or graduate levels and appropriate scholarship and service. Mini- 
mum qualifications: ABD toward doctorate in accounting or other business related field, completion 
of degree by November 1, 2012; evidence of scholarly research; teaching experience in accounting 
or other business related field at the university level. Preferred qualifications: Ph.D. or D.B.A. in 
accounting; full-time teaching experience at the university level; significant record of publications in 
refereed journals in the accounting area and/or evidence of other scholarly contributions; and CPA or 
CMA certificate. Competitive salary and benefits package. Send letter of application, curriculum 
vitae, official transcripts, and the names of three references to: Dr. Lucinda L. Van Alst, Chair, 
Department of Accounting, Ball State University, WB 303, Muncie, IN 47306. Review of applica- 
tions will begin immediately and will continue until the position is filled. Visit http://www.bsu.edu. 
Ball State University is an Equal Opportunity/Affirmative Action Employer and is strongly and 
actively committed to diversity within its community. 





THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL, Kenan-Flagler Business School 
is seeking to fill one or more tenure-track, tenured, or fixed-term faculty positions in the Accounting 
area starting July 1, 2011. Successful candidates will have strong research skills and be expected to 
publish in top-tier academic journals. Positions require a Ph.D. granted, or nearly completed, in the 
academic field identified, or in a related field. Hired candidates with a Ph.D. can anticipate an initial 
appointment of Assistant Professor, Associate Professor, or Professor. Hired candidates with a Ph.D. 
nearly completed can anticipate an initial appointment of Instructor. The successful candidate will be 
both a productive researcher, and a creative and effective teacher capable of contributing to the 
School’s Undergraduate, M.B.A., Master’s of Accounting, Ph.D., and Executive Development Pro- 
grams. Qualified applicants are to complete the application process online at http://jobs.unc.edu/ 
2500448, in order to be considered. Full submissions should include the following: curriculum vitae, 
sample research papers, example of teaching effectiveness (e.g., teaching evaluations, comments 
from students), and four original letters of recommendation. Please also upload letters of recom- 
mendation in the web system when applying on this recruitment. For additional information, send an 
email to: AccountingRecruiting@unc.edu. Original materials should be mailed to: Kenan-Flagler 
Business School, University of North Carolina at Chapel Hill, CB 3490, McColl Building, Chapel 
Hill, NC 27599-3490, Attn: Accounting. The University of North Carolina at Chapel Hill is an Equal 
Opportunity Employer. 


FINANCIAL/ACCOUNTING/INSURANCE/TAX TRANSLATOR with potential earnings in ex- 
cess of $125,000 (€90,000) per year. An international translation agency headquartered in Paris, 
France, and specializing in financial and legal translations for blue-chip clients is constantly seeking 
to hire home-based translators and editors. Candidate should be: experienced and/or qualified ac- 
counting, finance, insurance, or tax professional; able to write clearly and concisely in your native 
language; capable of meeting demanding deadlines; fluent in/with good working knowledge of 
French (and/or one other major European language) in addition to your native language, with good 
comprehension of the foreign-language terminology in your subject area. Looking for a career 
change and the opportunity to work from home at hours that suit you? Send your resume and cover 
letter, or any inquiries, to: recrutement@tectrad.com. 


UNIVERSITY OF NOTRE DAME, Department of Accountancy invites applications for position 
openings at all levels: Assistant, Associate, or Full Professor. All research areas will be considered. 
The appointment is expected to begin July 1, 2011. Applicants for Assistant Professor positions 
should expect to complete doctoral degree requirements before the start of the contract period. All 
candidates should have a strong commitment to scholarly research and excellence in teaching. 
Salary is very competitive and research support is excellent. Please submit curriculum vita, three 
letters of recommendation, teaching evaluations, and an example of scholarly work to: 
acctdept@nd.edu. Also, address inquiries to: acctdept@nd.edu. The deadline for applications is 
January 15, 2011. The University of Notre Dame is an Affirmative Action Employer with a strong 
commitment to fostering a culturally diverse atmosphere for faculty, staff, and students. Women, 
minorities, and those attracted to a university with a Catholic identity are especially encouraged to 
apply. For more information about the University of Notre Dame, the Mendoza College of Business, 
and the Department of Accountancy please visit the department's website at: http://business.nd.edu/ 
accountancy/. 


UNIVERSITY OF HOUSTON—DOWNTOWN, FACIS Department, College of Business is seek- 
ing applicants for one tenure-track position at the rank of Assistant Professor and another at the rank 
of Associate Professor, which, depending on credentials, can include a named professorship. While 
the Search Committee will accept applications until the positions are filled, interested individuals are 
encouraged to apply by January 1, 2011 to assure full consideration. Online applications are required 
at: https://jobs.uhd.edu/applicants. UHD enrolls over 12,000 undergraduate and graduate students 
and offers daytime, evening, and weekend courses through both classroom and various technology- 
enabled mediums. UHD has received top rankings by U.S. News & World Report for wireless 
Internet accessibility and for the diversity of its student body. Houston is the nation's fourth most 
populous city and second lowest in terms of cost of living among major American cities. The city is 
home to 23 Fortune 500 companies and many other world-class institutions in the areas of energy, 
medicine, aerospace, manufacturing, and business services. The University's proximity to downtown 
enables excellent access to these and the many other resources available in Houston and the sur- 
rounding region. AA/EOE. 


UNIVERSITY OF MINNESOTA—DULUTH, Department of Accounting, Labovitz School of Busi- 
ness and Economics has a full-time, nine-month, tenure-track, Associate/Assistant/Instructor posi- 
tion available beginning Fall 2011. Job duties and responsibilities include teaching, research, and 
service. The qualified individual will develop and teach accounting courses in auditing, as well as 
either financial or managerial accounting areas. Essential qualifications include ABD (with a clear 
plan for completion by September 1, 2012) from a doctoral program in Accounting or Business 
Administration with a concentration in Accounting, from an AACSB International-accredited school 
or major internationally recognized university. For a complete position description and information 
on how to apply online, visit http://employment.umn.edu/; search for Job Requisition 169070. Com- 
plete applications will be reviewed beginning February 15, 2011, and continue until the position is 
filled. The University of Minnesota is an Equal Opportunity Educator and Employer. 


THE UNIVERSITY OF TEXAS AT ARLINGTON, Department of Accounting invites applications 
for an Endowed Chair (Full Professor) or Endowed Professorship (Associate or Full Professor) 
position to begin Fall 2011. Job duties include mentoring doctoral students and junior faculty, 
leading and conducting high-impact accounting research, teaching undergraduate and graduate stu- 
dents, and providing academic and professional service. Candidates should hold a Ph.D. in Account- 
ing, have an outstanding record of scholarly research and publications, an established record of 
teaching excellence, and a promise of future contributions of an exceptional nature. Professional 
certification and accounting experience are desirable. Compensation is competitive according to 
qualifications and experience. Review of applications will begin immediately and continue until the 
position is filled. This is a security-sensitive position, and a criminal background check will be 
conducted on finalists. Interested persons should send a current cv, the names and addresses of at 
least three academic references, demonstrated evidence of excellence in research and teaching, and 
representative publications or working paper to: Dr. Jennifer Ho, Search Chair Committee, Depart- 
ment of Accounting, College of Business, Box 19468, The University of Texas at Arlington, Arling- 
ton, TX 76019. The University of Texas at Arlington does not discriminate on the basis of race, 
color, national origin, sex, religion, age, disability, veteran status, or sexual orientation in employ- 
ment or in the provision of services. Effective August 1, 2011, the use of tobacco products (including 
cigarettes, cigars, pipes, smokeless tobacco, and other tobacco products) by students, faculty, staff, 
and visitors is prohibited on all UT Arlington properties. 





WASHINGTON UNIVERSITY IN ST. LOUIS, Olin Business School invites applicants to fill fac- 
ulty positions in the area of Accounting beginning in Fall 2011 to teach courses in the school's 
Bachelor's, Master's, and/or Ph.D. programs, conduct research, publish in peer-reviewed journals 
and participate in faculty service activities. Applicants at the Assistant Professor level should have 
high research potential and have earned their Ph.D. or be close to completion. Seasoned candidates 
must have a demonstrated outstanding research record and be highly visible in the profession. 
Applications are being accepted through the Washington University in St. Louis Faculty Recruitment 
link. Please visit our website: https://jobs.wustl.edu and review job # 21014 for further information 
and to apply. Reviews will begin immediately and will continue until the position is filled. (Hiring 
is contingent on available funding.) Olin Business School fosters and appreciates ethnic and cultural 
diversity among its faculty, students, and staff. Applications from men, women, ethnic minorities, 
veterans, dual-career couples, and individuals with disabilities are encouraged. 


CORNERSTONE RESEARCH 


Who We Are 
Cornerstone Research is a leading economics and financial consulting firm with more than 400 full-time 
staff members across six offices. Together with an extensive network of nationally prominent faculty who 
testify as expert witnesses, our staff analyze complex economic, financial, accounting, and marketing issues 
that arise in litigation. Our culture of growth and collegiality provides excellent career prospects to those 
who have pursued doctoral studies in accounting, economics, finance, or marketing. 





Cornerstone Research is involved in a broad variety of projects, including high-profile legal cases and 
disputes. Our recent cases have involved a wide variety of accounting and auditing issues, including those 
related to subprime mortgages, securitizations, fair value, derivative and hedge accounting, revenue 
recognition, and SOX and other regulatory compliance. Our cases have included allegations against 
accounting firms, investment banks, corporations, directors, and officers. We have experience in such 
diverse industries as financial institutions, pharmaceuticals, healthcare, energy, telecommunications, high 
technology, traditional manufacturing, and retailing. More detail can be found on our Web site at 
www.cornerstone.com. 


Associates at Cornerstone Research 
Cornerstone Research provides an interesting and rewarding work environment. Our staff works directly 
with esteemed faculty experts in a distinctive partnership that combines the strengths of the academic and 
business worlds. Cornerstone Research provides opportunities for associates to develop as testifying experts 
if they are interested in such a career path. 


Associates are central to casework at Cornerstone Research and are involved in all phases of a project. In 
the initial stages of a case, associates actively participate in the formulation of the work plan and the 
analysis of issues. As the case progresses, associates handle the complex aspects of a case directly, while 
managing and advising analysts. They work with senior staff, experts, and clients to develop case strategy 
and to determine how best to communicate our findings. In addition, associates actively participate in 
shaping and implementing the firm's recruiting, training, and practice development strategies. Overall, 
Cornerstone offers an excellent package of interesting work; exposure to a broad set of issues in finance, 
economics, and accounting; and a high degree of personal responsibility. 


Candidate Profile 
We seek candidates who have pursued doctoral studies in accounting with the ability to apply academic 
research to real-world issues and present concise explanations of complex analyses. The ideal candidate will 
possess a strong empirical background in financial accounting, cost accounting, or auditing; technical 
expertise in accounting; and excellent interpersonal skills. CPAs are strongly encouraged to apply. 


Interested candidates should submit a cover letter (that includes a statement describing your interest in 
economics and financial consulting and your location preferences), resume, and a sample research paper to 
http://associatecareer.cornerstone.com. In addition, your application will only be considered complete if it 
includes three letters of recommendation; please have those sent to us using the information below. Please 
be sure to rank our offices in order of your location preference, if any. We have offices in Boston, Los 
Angeles, Menlo Park (CA), New York, San Francisco, and Washington, DC. 





Contact 
Recruiting Administrator - TAR 
Cornerstone Research 
599 Lexington Avenue, 43rd Floor 
New York, NY 10022 
Online Application URL: http://associatecareer.cornerstone.com 
E-mail: associate-recruiting@cornerstone.com 


Cornerstone Research is an Equal Opportunity Employer 


Boston, MA • Los Angeles, CA • Menlo Park, CA • New York, NY • San Francisco, CA • Washington, DC 





THE ACCOUNTING REVIEW 
ISSN 0001-4826 
PERIODICALS POSTAGE PAID 
SARASOTA, FLORIDA 
(AND ADDITIONAL MAILING OFFICES) 
POSTMASTER 
Send Address Changes To: 


AMERICAN ACCOUNTING ASSOCIATION 
5717 BESSIE DRIVE 
SARASOTA, FL 34233 


